(57) Abstract

A method for obtaining longer cDNA sequences is provided. The method utilizes a known genomic DNA sequence or a partial 
cDNA sequence, such as can be obtained from GenBank partial cDNAs. Two PCR primers are designed to correspond to the ends of the 
known partial sequence and to anneal to DNA in a cDNA library so as to initiate extension away from the known cDNA and the other 
primer. The primers are added to a cDNA library with appropriate enzymes and extend through additional DNA sequence to produce PCR 
products, which are subsequently purified and sequenced to provide new sequences. The new sequences are then compared with the known 
partial cDNA sequence for areas of overlap, and the sequence is extended beyond the overlapping areas to provide longer DNA sequence. 

IMPROVED METHOD FOR OBTAINING FULL-LENGTH cDNA SEQUENCES

TECHNICAL FIELD 

The present invention is in the field of molecular biology 
and more particularly, in the field of recombinant DNA technology. 

BACKGROUND ART 

PCR has become a widely used nucleic acid amplification 
technique since it was first presented by Kary Mullis at the Cold 
Spring Harbor Symposium (Mullis K et al (1986) Cold Spring Harbor 
Symp Quant Biol 51: 263-273). PCR requires that a pair of primers 
be generated from known sequences. However, in many cases, 
sequence is available only from one end of a DNA segment. Several 
methods have been developed to sequence an entire gene once a 
partial nucleotide sequence is available. As more partial cDNA
sequences become available in the world's genetic databanks, more
efficient and economical methods will be sought for obtaining
the complete gene.

PCR has become a widely used technique to complete genes for 
which a partial sequence is already known. Gene-specific primers 
and primers located in the vector into which the cDNAs have been 
cloned are used for this purpose. However, this method is limited 
by the use of primers complementary to vector sequence which is 
common to all clones in the library. This results in an abundance 
of non-specific PCR-products which have to be cloned and 
sequenced. Multiple rounds of amplifications with nested primers 
might be required. These additional operations increase the 
incorporation of errors. 

Gobinda, Turner and Bolander (1993) in PCR Methods and 
Applications 2:318-22 disclose "restriction-site PCR" as a direct 
method of retrieving unknown sequence which is adjacent to a known 
locus by using universal primers. First, genomic DNA is amplified
in the presence of restriction site oligonucleotides and a primer
specific to the known region. Next, those products are subjected
to a second round of PCR with the same restriction site
oligonucleotides and another specific primer internal to the first
one. Subsequently, the products of the last round of PCR are
transcribed with an appropriate RNA polymerase and sequenced with
a reverse transcriptase and an end-labeled specific primer
internal to the second specific PCR primer. Gobinda et al.
present data concerning Factor IX for which they identified a
conserved stretch of 20 nucleotides in the 3' noncoding region of
the gene.

Inverse PCR was the first method reported to acquire unknown
sequences successfully starting with primers based on a known
region (Triglia T, Peterson MG, and Kemp DJ (1988) Nucleic
Acids Res. 16:8186). Inverse PCR employs a strategy in which
several restriction enzymes are used to generate a suitable
fragment in the known region. The segment is then circularized by
intramolecular ligation and used as a PCR template with divergent
primers created from the known region. However, the requirement
of multiple restriction enzyme digestions followed by multiple
ligations (even before PCR is started) makes the procedure slow and
expensive (Gobinda et al., supra).

Capture PCR, first disclosed by Lagerstrom M, Parik J,
Malmgren H, Stewart J, Patterson U and Landegren U (1991) PCR
Methods Applic. 1:111-19, is a method for PCR amplification of DNA
fragments adjacent to a known sequence in human and YAC DNA. As
noted by Gobinda et al., supra, that method also requires multiple
restriction enzyme digestions and ligation of an engineered
double-stranded primer before PCR. Although the restriction and
ligation reactions are carried out simultaneously in this method,
the requirement of extension reaction, immobilization of the
extended product, two rounds of PCR and purification of template
prior to sequencing render it cumbersome and time-consuming as
well.

Walking PCR, disclosed by Parker JD, Rabinovitch PS, and
Burmer GC (1991) Nucleic Acids Res 19:3055-60, teaches a method
for targeted gene walking via PCR. Although this method also
permits retrieval of unknown sequence, Gobinda et al., supra, note
that it requires oligomer-extension assay followed by
identification and gel purification of the desired band prior to
sequencing. Such extra steps again limit the applicability of the
method.

The enzymes originally used in PCR were limited in their
ability to reliably amplify long pieces of nucleic acids over 3 kb.
One of the explanations for this limitation seems to be the
misincorporation of nucleotides, resulting in non-basepairing
mismatches which these enzymes often fail to extend.

Only the mixture of two enzymes, rTth DNA-Polymerase and
Vent, the latter of which has so-called "proofreading" activity,
and the optimization of amplification conditions finally overcame
this limitation and made amplification of pieces of DNA of up to
40 kb possible.

The most common way to identify genes expressed in a certain
tissue at a certain time is the isolation of the mRNA of that
particular tissue and the conversion of this mRNA into so-called
cDNA (complementary DNA). These cDNAs are subsequently cloned into
a vector (plasmid or Lambda) and amplified by transfection into
E. coli cells, resulting in a so-called cDNA library.

First and most important to researchers attempting to obtain
a complete gene is that the enzymes used in converting mRNA into
cDNA are limited in their ability to produce complete copies of
the existing mRNAs. This requires the researcher to isolate
multiple cDNA clones of the gene of interest using specific probes
and analyze each of these isolates for a complete cDNA of the gene
of interest. This process is called screening of cDNA libraries.

A major problem facing molecular biologists is finding the
most efficient method to use to obtain a full-length cDNA from a
partial sequence. Such sequences are appearing with increasing
frequency in GenBank, from commercial cDNA libraries and privately
prepared libraries. The inventive method disclosed herein is a
contribution to that art.

DISCLOSURE OF THE INVENTION 
An improved method for extending the DNA sequence of a known 
fragment of DNA sequence is provided. The method may be used for 
extending known DNA sequences of genomic or cDNA origin. The 
method utilizes the polymerase chain reaction (PCR) and includes 
the steps of: 

a) combining a first and second PCR primer with nucleic acid 
from a cDNA library, or pools of cDNA libraries, expected to 
contain said partial cDNA, or said partial cDNA that has been
extended, or a genomic library, under conditions suitable for 
synthesis of nucleic acid PCR products from the first and second 
primers, wherein said first and second primers are capable of 
annealing to opposite strands of the partial cDNA or genomic DNA 
and initiating nucleic acid synthesis in an outward manner and 
wherein the first primer is capable of being extended by DNA 
polymerase in an antisense direction and the second primer is 
capable of being extended in a sense direction, 

b) purifying the PCR products, and 

c) identifying extended nucleotide sequences derived from 
said partial cDNA or said genomic DNA. In one embodiment of the 
present invention, the method of identifying the extended 
nucleotide sequences comprises nucleic acid sequencing. In 
another embodiment of the present invention, the method proceeds
by repeating steps a) through c) on the nucleotide sequences
identified in step c).

In another embodiment of the present invention, there is a
method for extending the nucleotide sequence of a partial
complementary DNA (cDNA) using polymerase chain reaction (PCR),
comprising the steps of a) combining a first and second PCR primer
with nucleic acid from a cDNA library, or pools of cDNA libraries,
expected to contain said partial cDNA, or said partial cDNA that
has been extended, or a genomic DNA library, under conditions
suitable for synthesis of nucleic acid PCR products from the first
and second primers, wherein said first and second primers are
capable of annealing to opposite strands of the partial cDNA and
initiating nucleic acid synthesis in an outward manner and wherein
the first primer is capable of being extended by DNA polymerase in
an antisense direction and the second primer is capable of being
extended in a sense direction,

b) purifying the PCR products,

c) ligating the purified PCR products under conditions
suitable for the formation of circular, closed nucleic acid,

d) transforming a host cell with the circular, closed nucleic
acid and culturing the transformed host cell under conditions
suitable for growth,

e) recovering said circular closed nucleic acid from the
cultured, transformed host cell, and

f) identifying extended nucleotide sequences derived from
said partial cDNA or said genomic DNA.

The present invention also provides a method for extending 
known genomic DNA sequences which may be used for the detection 
and amplification of 5' untranslated nucleotide sequences and/or 
promoter sequences.

Also provided is an isolated DNA molecule comprising SEQ ID
NO: 11, the DNA for a novel human purinergic P2U receptor.

Also provided is an isolated DNA molecule comprising SEQ ID 
NO: 12, the DNA for a novel human C5a-like seven transmembrane 
receptor. 

These and other objects, advantages and features of the
present invention will become apparent to those persons skilled in
the art upon reading the details of the structure, synthesis,
formulation and usage as more fully set forth below, reference
being made to the accompanying figures forming a part hereof.

BRIEF DESCRIPTION OF DRAWINGS 

Figure 1 is a flow chart of the steps in the inventive 
method. 

Figure 2 shows a typical plasmid obtained from the excision 
process of a lambdaZAP cDNA library. Typically 250-300 base pairs 
of the sequence are obtained in the high-throughput sequence 
operation. The clone is partially sequenced from the 5' end with 
T3 as a sequencing primer. 

Figure 3 is a representation of the next step, in which 
pBLUESCRIPT SK plasmids in a cDNA library are used as a template
and the two specially designed primers (XLR and XLS) amplify 
plasmids containing the gene of interest. Only plasmids 
containing priming sites for both XL-PCR primers and the gene of 
interest will be amplified during the XL-PCR reaction. 

Figure 4 is a representation of the amplified DNA segments 
which have been obtained through the XL-PCR reaction and 
consequently purified after separating the products on an agarose 
gel. For best results, the cDNA library used as a template should 
be synthesized by random priming to assure the availability in
this step of amplified DNA of different lengths (3' end) between
the XLS priming site and the T7 priming site in the vector. The
length of the 5' end (between the XLR priming site and the T3 
priming site) in the vector will vary in size depending on how 
much of the mRNA of the gene of interest had been converted into 
cDNA during the cDNA library synthesis. 

Figure 5 shows how the purified DNA segments containing the 
plasmid and the gene of interest are religated to form a circular 
plasmid and transformed into bacteria for amplification. Here 
chemically competent E. coli cells were transformed and grown on
petri dishes containing LB agar and 25 mg/L carbenicillin (2XCarb) 
for antibiotic selection. 

WO 96/38591    PCT/US96/08501

Figure 6 shows schematically how pure samples of clones were
obtained from the different E. coli colonies grown in the
procedure shown in Figure 5 (also Step 1 purification, Step 2
religation and Step 3 transformation in Figure 6). These clones
are screened in Step 4 for additional sequence of the gene of
interest at the 5' end. For this purpose the clones were analyzed
by a PCR reaction employing the XLR primer and the T3 vector
primer. The size of the resulting product will indicate how much
additional sequence upstream of the XLR priming site each clone
contains.

Figures 7A through 7H show the results of the inventive
method, in which a partial sequence from Incyte clone 14770, which
was similar to heat shock protein 90, was successively sequenced
to obtain a full-length cDNA.

Figures 8A through 8F show the results of the inventive
method, in which a partial sequence from Incyte clone 87058, which
was similar to cathepsin, was successively sequenced to obtain
extensions of the cDNA.

MODES FOR CARRYING OUT THE INVENTION
Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as is commonly understood by one
of skill in the art to which this invention belongs. All patents
and publications referred to herein are incorporated by reference
herein.

Before the present compounds, variants, formulations and
methods for making and using such are described, it is to be
understood that this invention is not limited to the particular
compounds, variants, formulations or methods described, as such
enzymes, formulations and methodologies may, of course, vary. The
terminology used herein is for the purpose of describing
particular embodiments only and is not intended to be limiting
since the scope of protection will be limited only by the appended
claims.

In the specification and appended claims, the singular forms
"a", "an" and "the" include plural referents unless the context
clearly dictates otherwise. Thus, for example, reference to "a
high-fidelity PCR enzyme" includes mixtures of such enzymes and
any other enzymes fitting the stated criteria, and reference to the
method includes reference to one or more methods for obtaining
full-length cDNA sequences which will be known to those skilled in
the art or will become known to them upon reading this
specification.

The present method provides a way to utilize a genomic 
DNA library or a plasmid cDNA library (either obtained by cloning 
cDNAs directly into a plasmid vector or by converting a Lambda 
library into a plasmid library by known methods e.g. Lambda ZAP 
excision or Lambda ZIPLOCK conversion) which has been used for 
sequencing cDNAs, as a source to obtain much longer DNAs and in 
certain cases complete genes of partially known DNA sequences. 
The steps disclosed herein are based on cDNA libraries but equally 
apply to genomic DNA libraries. 

This new method utilizes PCR kits which enable the researcher 
to amplify long pieces of DNA. The XL-PCR amplification kit 
(Perkin-Elmer) was employed. However, equivalent products may be
available from other major suppliers. This novel method allows one 
person to process multiple genes (up to 96 genes) at a time and 
obtain extended or complete sequence (possibly full-length) of the 
cDNAs of interest within 6-10 days. This compares very favorably 
with current competitive methods like screening with labelled 
probes which allow one worker to process only about 3-5 genes and 
obtain initial results in 14-40 days. This represents an increase 
in throughput of at least 1000%. 
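
The throughput comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted above; pairing the slowest quoted turnaround of the new method with the fastest quoted turnaround of probe screening is an assumption, chosen to make the comparison as conservative as possible.

```python
# Rough throughput comparison using the figures quoted in the text.
# Assumption: the most conservative pairing is used, i.e. the slowest
# quoted turnaround for the new method against the fastest quoted
# turnaround for probe-based library screening.

def genes_per_day(genes: float, days: float) -> float:
    """Genes completed per worker-day."""
    return genes / days

new_method = genes_per_day(genes=96, days=10)  # up to 96 genes in 6-10 days
screening = genes_per_day(genes=5, days=14)    # 3-5 genes in 14-40 days

fold = new_method / screening
print(f"new method: {new_method:.2f} genes/day")
print(f"screening:  {screening:.2f} genes/day")
print(f"fold increase: {fold:.1f}x")
```

Even under this conservative pairing the ratio stays above tenfold, which is in line with the stated figure.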

This increased efficiency is possible because of the 
inventive combination of steps shown in the flow chart (Figure 1).
First, primer design and synthesis (based on a known partial
sequence) can be performed in about two days. The PCR
amplification can be performed in 6-8 hours. Multiple libraries
can be pooled and therefore screened at the same time. The next
steps of purification and ligation take about one day. Then 
transformation and growing up the bacteria take one day. Then 
screening for clones with additional sequence of the genes of 
interest by PCR takes approximately five hours. The next steps of 
DNA preparation and sequencing of the selected clones can be 
performed in about one day. This totals 6-7 days. At the end of 
this time, one has usually obtained a much longer cDNA sequence, 
assuming such a longer cDNA existed in the libraries than what was 
initially sequenced. If the new sequence is a complete gene, then 
the goal has been reached. If the complete sequence has not been 
obtained, one still has a much longer sequence than before, and 
this longer sequence can be used to design primers to repeat the 
procedure on the same or another library. The choice of library 
is up to the researcher, but a preferred library is one that has 
been size-selected to include only larger cDNAs. 
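
The comparison of a newly obtained sequence with the known partial sequence can be pictured as an overlap-and-merge operation. The following is a minimal sketch, assuming exact matching and extension at the 3' end only; the function name `extend_with_overlap`, the `min_overlap` parameter and the toy sequences are illustrative and are not part of the disclosed method, which relies on standard sequence assembly software.

```python
def extend_with_overlap(known: str, new: str, min_overlap: int = 20) -> str:
    """Merge `new` onto the 3' end of `known` if the two overlap.

    Searches for the longest suffix of `known` that is also a prefix of
    `new` (exact match, at least `min_overlap` bases) and returns the
    merged sequence. Returns `known` unchanged if no overlap is found.
    """
    max_len = min(len(known), len(new))
    for size in range(max_len, min_overlap - 1, -1):
        if known[-size:] == new[:size]:
            return known + new[size:]
    return known


# Toy sequences (hypothetical); real reads would be far longer.
known = "ATGGCGTACGTTAGC"
new_read = "CGTTAGCTTGACCA"
merged = extend_with_overlap(known, new_read, min_overlap=5)
print(merged)
```

If the merged sequence is still incomplete, it becomes the "known" sequence for the next round of primer design.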

This method presumes that one already has partial cDNA 
sequences, either from a publicly available database or the 
scientist's own earlier research, including but not limited to
earlier preparation of a cDNA library whose cDNAs have been
partially sequenced. The cDNA library may have been prepared with
oligo dT or random primers. The difference between oligo dT and
randomly primed libraries is that a randomly primed library will
have more sequences which contain 5' ends of cDNAs. A randomly
primed library may be particularly useful for further work when 
the oligo dT library does not yield a complete gene. Random 
priming of the library also helps yield more cDNA sequences of 
different lengths. Library preparation techniques which promote 
longer insert sizes will in turn permit the sequencing of more 
complete cDNAs. Obviously, the larger the protein, the less 
likely it is that the complete cDNA will be found in a single 
plasmid. 

Figure 2 shows a typical plasmid containing a cDNA which had
been partially sequenced from the 5' end with T3 as a primer. The
top darkened portion represents the insert containing the gene of
interest.

Step 1: PCR amplification of cDNA clones containing the gene of interest

The first step of this method requires the design of two
primers based on the known sequence. The known sequence can be
obtained by those skilled in the art either by a wet lab method or
from the many publicly available DNA databases. One primer is
synthesized to be extended in an antisense direction (XLR) and the
other in the sense direction (XLS or XLF). In effect, the primers
are designed to anneal to either end of the known sequence and to
be extended "outward" from there to generate amplicons containing
new, unknown sequences of the genes of interest. This is
different from typical PCR, in which the primers are designed to
amplify a known sequence in a direction "inward" toward each
other.
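
The outward orientation can be made concrete with a short sketch. It assumes the known sequence is given as the sense strand written 5' to 3'; the function names and the toy sequence are hypothetical, and real primers would be placed by primer-design software rather than taken blindly from the two ends.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def outward_primers(known: str, length: int = 22) -> tuple[str, str]:
    """Return (xlr, xls) for a known sense-strand sequence (5'->3').

    XLR is the reverse complement of the first `length` bases, so it is
    extended in the antisense direction, away from the known region.
    XLS is the last `length` bases as written, so it is extended in the
    sense direction, again away from the known region.
    """
    if len(known) < 2 * length:
        raise ValueError("known sequence too short for two primers")
    xlr = reverse_complement(known[:length])
    xls = known[-length:]
    return xlr, xls


# Toy example with 4-base "primers" (hypothetical sequence).
xlr, xls = outward_primers("ATGCCGTAAGGCTTAC", length=4)
print(xlr, xls)
```

Conventional inward-facing PCR would swap the roles: the sense primer would sit at the 5' end and the antisense primer at the 3' end of the known region.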

The primers need to be designed to meet optimal criteria for
extra long PCR. A program like Oligo 4.0s (National
Biosciences, Inc., Plymouth MN) can be employed for this purpose.
In general, primers should be 22-30 nucleotides in length, have
a GC content of 50% or more and anneal at 68°C-72°C to the
target. Hairpin structures and primer-primer dimerizations must be
avoided.
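
The length and GC criteria are easy to screen for programmatically. The sketch below checks only those two criteria; annealing temperature, hairpins and primer-primer dimers are left to dedicated primer-design software. The function names are illustrative, and the two test sequences are the XLR and XLS primers listed in Example 1.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer (spaces are ignored)."""
    seq = primer.replace(" ", "").upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_basic_criteria(primer: str, min_len: int = 22,
                         max_len: int = 30, min_gc: float = 0.5) -> bool:
    """Check the length (22-30 nt) and GC content (>= 50%) criteria only."""
    seq = primer.replace(" ", "")
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Primers as written in Example 1.
xlr = "AGC TGT CCA TGA TGA ACA CAC G"
xls = "AAT AGG CAC CAC ACC AAC TGA G"

for name, primer in (("XLR", xlr), ("XLS", xls)):
    length = len(primer.replace(" ", ""))
    print(name, length, f"{gc_fraction(primer):.0%}",
          meets_basic_criteria(primer))
```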

Primers varying from the conditions described above may
result in amplification of the desired targets, provided that
extension conditions have been adjusted.

Figure 3 shows the next step, in which a cDNA library is used
as a template and the two primers (XLR and XLS) amplify plasmids
containing the gene of interest. In this step, it is very helpful
to use PCR enzymes which provide high fidelity and copy long
sequences, such as that provided in the XL-PCR kit (Part No.
N808-0182, Perkin Elmer, Applied Biosystems, Foster City, CA).
Generally, kit instructions should be followed, including
suggestions to optimize concentrations of various reagents. In
the examples disclosed infra, 25 pmol of each primer worked well.
Template (plasmid library) concentrations can be varied (see
Examples infra for details). It is essential to thoroughly
resuspend the enzyme in solution prior to use, especially if the
solution has been stored at -20°C. If the enzyme is not
adequately resuspended, its effectiveness is impaired. The
preferred system is set up initially in two layers, employing
AmpliWax™ PCR Gems. However, efficiency can be increased by
avoiding the use of these Gems and initiating amplification by
using the "hot-start" technique by adding magnesium, which is
essential for amplification, at 82°C.

Although various cycling conditions are detailed in the
examples infra, the following cycling conditions have been found
to be optimal with the MJ PCT200 thermocycler (MJ Research,
Watertown, MA). Times and temperatures may be varied to optimize
conditions in different thermocyclers.

Step 1    94° for 60 sec (initial denaturation)
Step 2    94° for 15 sec
Step 3    65° for 1 min
Step 4    68° for 7 min
Step 5    Repeat steps 2-4 for 15 additional times
Step 6    94° for 15 sec
Step 7    65° for 1 min
Step 8    68° for 7 min + 15 sec/cycle
Step 9    Repeat steps 6-8 for 11 additional times
Step 10   72° for 8 min
Step 11   4° for 0.00 sec (to hold at 4°)

At the end of these 28 cycles, 50 µl of the reaction mix is
removed; on the remaining reaction mix, 10 additional cycles
are run, as outlined below:

Step 1    94° for 15 sec
Step 2    65° for 1 min
Step 3    68° for 10 min + 15 sec/cycle
Step 4    Repeat steps 1-3 for 9 additional times
Step 5    72° for 10 min

Next a 5-10 µl aliquot of the reaction mixture can be
analyzed on a mini-gel to determine which reactions were
successful.

Step 2: Purification of amplicons containing the gene of interest
Figure 4 is a graphical representation of the amplified cDNA
segments which have been separated on an agarose gel. Note that
there are a variety of lengths of cDNA. Although the rest of the
method could be performed using all extended cDNA species, the
method can optionally proceed after selecting the largest products
(likeliest to provide the remainder of the full-length gene).
Some of the larger species may in fact be hybrid clones which
contain two cDNA inserts as a result of a malfunction during
cDNA library construction, which may reflect an incomplete
digestion with the restriction enzyme at the end of the cDNA
synthesis. Such amplified hybrid clones, also called chimeras,
could result in overlooking the correct targeted extensions.

Successful reaction products should be purified on an agarose
gel (preferably at a low agarose concentration of 0.6-0.8%)
or by another appropriate method. An appropriate volume of
reaction mixture should be loaded to obtain good separation of the
products and to separate them from the plasmid library (template)
still in the reaction mixture. Contamination with the template
cDNA library will result in transformants which don't contain the
desired gene and will require an extensive screening of many
colonies. The bands representing the genes of interest are then
cut out of the gel and purified using a method like the QIAQuick
gel extraction kit (Qiagen, Inc., Chatsworth, CA).

Step 3: Cloning of amplicons containing the gene of interest

Any overhangs are converted into blunt ends to
facilitate religation and cloning of the products. For this
purpose, Klenow enzyme (3 units/reaction mixture) and dNTPs (0.2
mM final concentration) are added and the reaction is incubated at
room temperature for 30 min. The Klenow enzyme is then
inactivated by incubating the reaction at 75°C for 15 min.

The products are then ethanol precipitated and redissolved in
13 µl of ligation buffer containing 1 mM ATP. 1 µl T4 DNA ligase
(15 units) and T4 polynucleotide kinase (5 units) are added and
the reaction is incubated at room temperature for 2-3 hours or
overnight at 16°C.

3 µl of the ligation mixture are transformed into 40 µl of
competent E. coli cells (prepared with a standard protocol). 80 µl
of SOC medium are added and after 1 hour of recovery of the cells
at 37°C the whole transformation mixture is plated on LB-agar
2XCarb-containing petri plates.

Step 4: Screening of cloned products

The next day 8 or 12 colonies are randomly picked from each
plate and grown in individual wells of a sterile 96-well
microtiter plate (e.g. 96 Well Cell Culture Cluster, Catalog No.
3799, Costar Corp., Cambridge, MA 02140). Each well contains 150 µl
of LB/2XCarb medium. Thus, each row of the microtiter plate
contains twelve clones from the same extension reaction. The
cells are grown overnight at 37°C.

The next day, 5 µl of these overnight cultures are transferred
into a non-sterile 96-well plate (Falcon 3911 Microtest III™,
Flexible Assay Plate, Becton Dickinson, Oxnard, CA) and diluted
1:10 with water. 5 µl of each dilution are then transferred into a
PCR array (e.g., Cycleplate, Robbins Scientific Corp., Sunnyvale,
CA). To obtain a 1X final concentration of PCR reagents, 15 µl of
a 1.33X concentrated PCR mix are added to each well. Another way
of efficient screening for extension products is the multiplex PCR
method where multiple specific primers are pooled and submitted to
the same reaction, therefore increasing the efficiency of setting
up the screening mixtures. Addition of the PCR template
(individual cultures) has been improved by the use of a 96-pin
tool with which an aliquot of all 96 cultures grown as described
above can be transferred into the PCR-screening mix in a matter of
1-2 minutes.
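
The 1.33X figure follows from the volumes: 15 µl of concentrated mix is diluted by 5 µl of template to a 20 µl reaction. A minimal sketch of that arithmetic (the function name is illustrative):

```python
def final_fold(mix_fold: float, mix_ul: float, template_ul: float) -> float:
    """Final reagent concentration (in X) after adding template volume."""
    return mix_fold * mix_ul / (mix_ul + template_ul)


# 15 µl of 1.33X mix plus 5 µl of diluted culture, as described above.
final = final_fold(mix_fold=1.33, mix_ul=15.0, template_ul=5.0)

# Concentration the mix would need for exactly 1X in the final volume.
required = (15.0 + 5.0) / 15.0

print(f"final concentration: {final:.4f}X")
print(f"mix concentration needed for exactly 1X: {required:.3f}X")
```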

For PCR amplification, the final concentrations are 1X for the
PCR mix and 5 µM for each of a vector primer and one or both of the
gene-specific primers used for the original extension reaction;
0.75 units of Taq polymerase are added to each well.
Amplification generally was performed using the following
conditions:

Step 1    94°C for 60 sec
Step 2    94°C for 20 sec
Step 3    55°C for 30 sec
Step 4    72°C for 90 sec
Step 5    Repeat steps 2-4 for an additional 29 times
Step 6    72°C for 180 sec
Step 7    4°C forever

Aliquots of these PCR reactions are run on agarose gels
together with molecular weight markers. The size of the resulting
PCR products will allow direct determination of how much
additional sequence the selected clones contain compared to the
original partial cDNA. The efficiency of the method has been
further improved by using the resulting PCR products directly for
sequencing, thus avoiding the necessity of preparing plasmids.

The appropriate clones are selected and grown for plasmid
preparation and sequencing.

Plasmid preparations are made with standard kits familiar to
those skilled in the art. Examples include the PROMEGA Magic
MINIPREP and the AGTC alkaline lysis kit.

Sequencing is performed employing standard automated ABI
sequencing equipment and protocols using either dye-primer or
dye-terminator kits.

Sequence processing and assemblage of the sequencing data are
performed using standard ABI software, including INHERIT™ analysis
and the Power assembler.

INDUSTRIAL APPLICABILITY

Example 1

For the initial method evaluation, a known gene was selected.
A partial sequence of the human 90-kDa heat-shock protein gene
(HUMHSP90, accession M16660) had been identified in a THP-1
library. This partial sequence (Incyte clone T-014201) initiated
at base 1127 of the sequence with accession number M16660.

1.1 Primer design

Two primers were designed to perform the method described in
the invention.

Primer 1 (XLR) 5' AGC TGT CCA TGA TGA ACA CAC G 3'
(1180-1159)

Primer 2 (XLS) 5' AAT AGG CAC CAC ACC AAC TGA G 3'
(2011-2032)

1.2 Template preparation

A THP-1 cDNA library constructed into the LambdaZAP vector
(Stratagene) was converted into a plasmid library following the
mass excision protocol. Plasmids of the excised libraries were
prepared using the Qiagen Midi plasmid purification kit.

1.3 XL-PCR reaction set-up

The extension reactions were prepared following the
instructions provided with the GeneAmp XL PCR Kit (Part No.
N808-0182) from Perkin Elmer. A two-layer system was set up as
follows:

The lower reagent mix was prepared by pipetting the following
components into a 0.2 ml MicroAmp reaction tube.

Lower reagent mix preparation:
Water                        13.6 µl
3.3X buffer                  12.0 µl
dATP (10 mM)                  2.0 µl
dCTP (10 mM)                  2.0 µl
dGTP (10 mM)                  2.0 µl
dTTP (10 mM)                  2.0 µl
Primer XLS (50 µM)            1.0 µl
Primer XLR (50 µM)            1.0 µl
Mg(OAc)2 (25 mM)              4.4 µl

Total lower reagent mix      40.0 µl

One AmpliWax™ gem was added to the tube. The wax was melted
by incubating the reaction tubes at 75°C for 5 minutes. Then the
tubes were cooled down to 4°C.

Upper reagent mix preparation:
3.3X buffer                  18.0 µl
rTth DNA Polymerase           2.0 µl

Total upper enzyme mix       20.0 µl

20 µl of the enzyme/buffer mix are added to each tube and
kept separated from the lower mix by the wax layer.

Addition of template:

The template DNA (excised library) was diluted to an
appropriate concentration in water and then added to the upper
mix. Mixing of the components is not necessary.

Template (6.25 ng/µl)        40.0 µl

Final volume                100.0 µl
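
The volumes in the reaction set-up can be cross-checked, and the final reagent concentrations derived, with a few lines of arithmetic. The sketch below takes the stock concentrations and volumes from the tables above as reconstructed from the source; the dictionary keys and the helper `final_conc` are illustrative names.

```python
# Volumes in microliters, taken from the reaction set-up tables.
lower = {
    "water": 13.6, "3.3X buffer": 12.0,
    "dATP": 2.0, "dCTP": 2.0, "dGTP": 2.0, "dTTP": 2.0,
    "primer XLS": 1.0, "primer XLR": 1.0, "Mg(OAc)2": 4.4,
}
upper = {"3.3X buffer": 18.0, "rTth DNA polymerase": 2.0}
template_ul = 40.0

lower_ul = sum(lower.values())
upper_ul = sum(upper.values())
final_ul = lower_ul + upper_ul + template_ul


def final_conc(stock: float, volume_ul: float) -> float:
    """Final concentration, in the same unit as the stock."""
    return stock * volume_ul / final_ul


dntp_mM = final_conc(10.0, lower["dATP"])          # each dNTP, mM
primer_uM = final_conc(50.0, lower["primer XLS"])  # each primer, µM
mg_mM = final_conc(25.0, lower["Mg(OAc)2"])        # Mg(OAc)2, mM
buffer_x = final_conc(3.3, lower["3.3X buffer"] + upper["3.3X buffer"])

print(f"lower {lower_ul:.1f} µl, upper {upper_ul:.1f} µl, "
      f"final {final_ul:.1f} µl")
print(f"dNTP {dntp_mM:.2f} mM each, primer {primer_uM:.2f} µM each, "
      f"Mg {mg_mM:.2f} mM, buffer {buffer_x:.2f}X")
```

The component volumes add up to the stated subtotals and to the 100 µl final volume.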

1.4 XL-PCR amplification

For amplification the following protocol was employed:

Step 1    94° for 60 sec (initial denaturation)
Step 2    94° for 15 sec
Step 3    65° for 1 min
Step 4    68° for 7 min
Step 5    Repeat steps 2-4 for 15 additional times
Step 6    94° for 15 sec
Step 7    65° for 1 min
Step 8    68° for 7 min + 15 sec/cycle
Step 9    Repeat steps 6-8 for 11 additional times
Step 10   72° for 8 min
Step 11   4° for 0.00 sec (to hold at 4°)

1.5 Purification of amplified products

30 µl of the amplified products were run on a 0.7% agarose
gel for 16 hours. Visible DNA bands were then cut out and purified
using the QIAquick gel purification kit.

1.6 Cloning of amplified products

Klenow enzyme (3 units/reaction) and dNTPs (0.2 mM final
concentration) were added and the reactions were incubated at room
temperature for 30 min followed by incubation at 75°C for 15 min.
The products were then ethanol precipitated and redissolved in 13
µl of ligation buffer containing 1 mM ATP. T4 DNA ligase (15 units)
and T4 polynucleotide kinase (5 units) were added, and the
reaction was incubated at room temperature for 3 hours.

3 µl of the ligation mixture were transformed into 40 µl of
competent E. coli cells. After heat-shocking the cells at 42°C for
45 seconds, 80 µl of SOC medium were added, and the cells were
allowed to recover at 37°C for 1 hour. The whole transformation
mixture then was plated on LB-agar/2XCarb-containing petri
plates.

1.7 Screening of cloned products 

The next day 10 colonies were randomly picked and grown
overnight in Falcon 2059 tubes (Becton Dickinson, Oxnard, CA)
containing 3 ml of LB-broth with 2X Carb.

5 µl of the cultures were diluted 1:10 with water and 5 µl of
this dilution were transferred into MicroAmp™ PCR tubes (Perkin
Elmer, Applied Biosystems, Foster City, CA).

15 µl of a 1.33X concentrated PCR mix were added to each
tube.

The 1.33X concentrated PCR mix contained the following
components:

10X PCR-buffer                2.0 µl
2 mM dNTPs                    2.0 µl
M13 rev primer (0.01 mM)      1.0 µl
Primer 2 (XLR, 0.01 mM)       1.0 µl
Taq Polymerase                0.15 µl
Water                         8.85 µl

Final Volume                 15.0 µl
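The "1.33X" designation follows from the volumes: 15 µl of mix plus 5 µl of diluted culture gives a 20 µl reaction. The sketch below merely checks this arithmetic; the dictionary keys and function names are illustrative and are not taken from any kit documentation.

```python
# Component volumes (µl) of the 1.33X colony-PCR mix, as listed above.
MIX_UL = {
    "10X PCR buffer": 2.0,
    "2 mM dNTPs": 2.0,
    "M13 rev primer (0.01 mM)": 1.0,
    "gene-specific primer (0.01 mM)": 1.0,
    "Taq polymerase": 0.15,
    "water": 8.85,
}
SAMPLE_UL = 5.0  # diluted overnight culture added to each tube


def mix_volume(mix) -> float:
    """Total volume of the mix in µl, rounded to pipetting precision."""
    return round(sum(mix.values()), 2)


def concentration_factor(mix_ul: float, sample_ul: float) -> float:
    """How concentrated the mix is relative to the final 1X reaction."""
    return (mix_ul + sample_ul) / mix_ul


def final_concentration(stock: float, stock_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * stock_ul / total_ul


if __name__ == "__main__":
    mix_ul = mix_volume(MIX_UL)
    total_ul = mix_ul + SAMPLE_UL
    print("mix volume (µl):", mix_ul)
    print("concentration factor:", round(concentration_factor(mix_ul, SAMPLE_UL), 2))
    print("buffer (X):", final_concentration(10.0, 2.0, total_ul))
    print("each primer (µM):", final_concentration(10.0, 1.0, total_ul))
```

With these volumes the buffer ends at 1X and each primer (0.01 mM = 10 µM stock) at 0.5 µM in the 20 µl reaction.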

The PCR cycling conditions were chosen as follows:

Step 1    94° C for 60 sec
Step 2    94° C for 20 sec
Step 3    55° C for 30 sec
Step 4    72° C for 90 sec
Step 5    Repeat steps 2-4 for an additional 29 times
Step 6    72° C for 180 sec
Step 7    4° C (and holding)

Aliquots of the amplified products were run on a 0.8% agarose
gel in parallel with the 1 kb DNA ladder (Life Technologies,
Gaithersburg, MD 20897). Appropriate plasmids containing different
size inserts were selected for sequencing analysis.

1.8 Sequencing analysis of cloned products

The DNA of the selected clones was prepared using the
Wizard™ Minipreps DNA Purification System (Promega Corporation,
Madison, WI) following the instructions of the manufacturer.
Sequencing reactions were performed using the PRISM™ Ready
Reaction DyeDeoxy Terminator Cycle Sequencing Kit (Part No 401628,
Perkin Elmer, Applied Biosystems, Foster City, CA).

1.9 Analysis of sequenced products

Three clones were selected for sequencing (14201.3, 14201.5, 
14201.13). The sequences obtained (SEQ ID NOS:3-5, respectively) 
were aligned using the DNASIS Multiple sequence alignment program. 

Clone 14201.3 initiated at base 24 of the published sequence
(HUMHSP90), clone 14201.5 initiated at base 13 of the published
sequence, and clone 14201.13 initiated at base 538 of the published
sequence; the original clone (14201) initiated at base 1127 of the
published sequence.

Figures 7A-7H show an alignment of the obtained sequences
with the published human Hsp 90 nucleotide sequence. Clones
14201.3 and 14201.5 contain part of the 5' untranslated region and
therefore the full coding region of the gene has been obtained.
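The 5' extension gained by each clone can be expressed as the difference between the start position of the original clone and that of the extended clone. The following sketch is illustrative only. It uses the first 90 bases of SEQ ID NO:1 and assumes, for the purpose of the example, that the first ATG in that fragment marks the beginning of the coding region; the function names are hypothetical.

```python
# Start positions on the published HUMHSP90 sequence, as reported above.
ORIGINAL_START = 1127
EXTENDED_STARTS = {"14201.3": 24, "14201.5": 13, "14201.13": 538}

# Bases 1-90 of SEQ ID NO:1 (GenBank HUMHSP90).
PUBLISHED_5PRIME = (
    "CTCCGGCGCAGTGTTGGGACTGTCTGGGTATCGGAAAGCAAGCCTACGTTGCTCACTATT"
    "ACGTATAATCCTTTTCTTTTCAAGATGCCT"
)


def first_atg(seq: str) -> int:
    """1-based position of the first ATG (assumed coding start) in seq."""
    offset = seq.find("ATG")
    if offset < 0:
        raise ValueError("no ATG found in fragment")
    return offset + 1


def extension_gain(original_start: int, new_start: int) -> int:
    """Number of additional 5' bases reached relative to the original clone."""
    return original_start - new_start


def reaches_5prime_utr(new_start: int, coding_start: int) -> bool:
    """True if the clone begins upstream of the assumed coding start."""
    return new_start < coding_start


if __name__ == "__main__":
    coding_start = first_atg(PUBLISHED_5PRIME)
    for clone, start in EXTENDED_STARTS.items():
        print(clone,
              extension_gain(ORIGINAL_START, start),
              reaches_5prime_utr(start, coding_start))
```

Under this assumption, clones 14201.3 and 14201.5 begin upstream of the coding start, whereas clone 14201.13 does not, which is consistent with the alignment described above.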
Example 2 

For further method evaluation, a second known gene was
selected. A partial sequence from a liver library was found to be
related to that of the human cathepsin B gene (accession L16510,
HUMCATHB, SEQ ID NO:6). This partial sequence (Incyte clone
87058, SEQ ID NO:7) initiated at base 1066 of the sequence with
accession number L16510.

2.1 Primer design 

Two primers were designed to perform the method described in 
the invention: 

Primer 1 (XLR) 5' AAG CCA TTG TCA CCC CAG TCA G 3'
(1103-1082)

Primer 2 (XLS) 5' GGT TCA CTG TGG AAT CGA ATC 3'
(1125-1145)
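The coordinates given for the antisense primer can be checked against the published sequence. The sketch below is illustrative only: it takes bases 1081-1140 of SEQ ID NO:6 as listed in the sequence listing, reverse-complements Primer 1 (XLR), and reports where it anneals. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def locate_antisense_primer(primer: str, fragment: str, fragment_start: int):
    """Return (first, last) 1-based sense-strand positions covered by an
    antisense primer, given a sense-strand fragment and its start position."""
    site = reverse_complement(primer.replace(" ", "").upper())
    offset = fragment.find(site)
    if offset < 0:
        raise ValueError("primer site not found in fragment")
    first = fragment_start + offset
    return first, first + len(site) - 1


# Bases 1081-1140 of SEQ ID NO:6 (HUMCATHB).
FRAGMENT_1081_1140 = (
    "ACTGACTGGGGTGACAATGGCTTCTTTAAAATACTCAGAGGACAGGATCACTGTGGAATC"
)
XLR = "AAG CCA TTG TCA CCC CAG TCA G"

if __name__ == "__main__":
    print(locate_antisense_primer(XLR, FRAGMENT_1081_1140, 1081))
```

The primer covers bases 1082 through 1103 of the sense strand; because it is antisense, its 5' end lies at base 1103 and it extends toward base 1, in agreement with the notation (1103-1082).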

2.2 Template preparation

A liver cDNA library constructed in the LambdaZAP vector
(Stratagene) was converted into a plasmid library following the
mass excision protocol. Plasmids of the excised libraries were
prepared using the QIAGEN Midi plasmid purification kit.

2.3 XL-PCR reaction set-up

The extension reactions were prepared following the 
instructions provided with the GeneAmp XL PCR Kit (Part No. 
N808-0182) from Perkin Elmer. A two layer system was set up as 
described below. The lower reagent mix was prepared by pipetting 
the following components into a 0.2ml MicroAmp reaction tube. 
Lower reagent mix preparation:

Water                      13.6 µl
3.3X buffer                12.0 µl
dATP (10 mM)                2.0 µl
dCTP (10 mM)                2.0 µl
dGTP (10 mM)                2.0 µl
dTTP (10 mM)                2.0 µl
Primer XLS (50 µM)          1.0 µl
Primer XLR (50 µM)          1.0 µl
Mg(OAc)2 (25 µM)            4.4 µl

Total lower reagent mix    40.0 µl


One AmpliWax™ gem was added to the tube. This was melted by
incubating the reaction tubes at 75° C for 5 minutes. Then the
tubes were cooled down to 4° C.
Upper reagent mix preparation:

3.3X buffer                18.0 µl
rTth DNA Polymerase         2.0 µl

Total upper enzyme mix     20.0 µl

20 µl of the enzyme/buffer mix were added to each tube and
kept separated from the lower mix by the wax layer.

Addition of template:

The template DNA (excised library) was diluted to an
appropriate concentration in water and then added to the upper
mix. Mixing of the components is not necessary.

Template (6.25 ng/µl)      40.0 µl

Final volume              100.0 µl
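The volumes of the two-layer set-up can be cross-checked as follows. This is an illustrative sketch only, using the volumes and stock concentrations listed above; the variable and function names are hypothetical.

```python
import math

# Volumes in µl, as listed for the lower and upper reagent mixes.
LOWER_MIX_UL = {
    "water": 13.6,
    "3.3X buffer": 12.0,
    "dATP": 2.0,
    "dCTP": 2.0,
    "dGTP": 2.0,
    "dTTP": 2.0,
    "primer XLS": 1.0,
    "primer XLR": 1.0,
    "Mg(OAc)2": 4.4,
}
UPPER_MIX_UL = {"3.3X buffer": 18.0, "rTth DNA polymerase": 2.0}
TEMPLATE_UL = 40.0
TEMPLATE_NG_PER_UL = 6.25


def total_ul(mix) -> float:
    """Sum of component volumes in µl."""
    return sum(mix.values())


def final_concentration(stock: float, stock_ul: float, reaction_ul: float) -> float:
    """Final concentration after dilution, in the same unit as the stock."""
    return stock * stock_ul / reaction_ul


FINAL_UL = total_ul(LOWER_MIX_UL) + total_ul(UPPER_MIX_UL) + TEMPLATE_UL

if __name__ == "__main__":
    buffer_ul = LOWER_MIX_UL["3.3X buffer"] + UPPER_MIX_UL["3.3X buffer"]
    print("final volume (µl):", round(FINAL_UL, 1))
    print("buffer (X):", round(final_concentration(3.3, buffer_ul, FINAL_UL), 2))
    print("each dNTP (mM):", round(final_concentration(10.0, 2.0, FINAL_UL), 2))
    print("each primer (µM):", round(final_concentration(50.0, 1.0, FINAL_UL), 2))
    print("template (ng):", TEMPLATE_NG_PER_UL * TEMPLATE_UL)
```

The combined 30 µl of 3.3X buffer in a 100 µl reaction corresponds to approximately 1X buffer, each dNTP is present at 0.2 mM, each primer at 0.5 µM, and 250 ng of template is added per reaction.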

2.4 XL-PCR amplification 

For amplification the following protocol was employed:

Step 1    94° C for 60 sec (initial denaturation)
Step 2    94° C for 15 sec
Step 3    65° C for 1 min
Step 4    68° C for 7 min
Step 5    Repeat steps 2-4 for 15 additional times
Step 6    94° C for 15 sec
Step 7    65° C for 1 min
Step 8    68° C for 7 min + 15 sec/cycle
Step 9    Repeat steps 6-8 for 11 additional times
Step 10   72° C for 8 min
Step 11   4° C for 0.00 sec (to hold at 4° C)


2.5 Purification of amplified products

30 µl of the amplified products were run on a 0.7% agarose
gel for 16 hours. Visible DNA bands were then cut out and purified
using the QIAquick gel purification kit.

2.6 Cloning of amplified products

Klenow enzyme (3 units/reaction) and dNTPs (0.2 mM final
concentration) were added, and the reactions were incubated at
room temperature for 30 min followed by incubation at 75° C for 15
min. The products were then ethanol precipitated and redissolved in 13
µl of ligation buffer containing 1 mM ATP. T4 DNA ligase (15 units)
and T4 polynucleotide kinase (5 units) were added, and the
reaction was incubated at room temperature for 3 hours.

3 µl of the ligation mixture were transformed into 40 µl of
competent E. coli cells. After heat-shocking the cells at 42° C for
45 seconds, 80 µl of SOC medium were added, and the cells were
allowed to recover at 37° C for 1 hour. The whole transformation
mixture was then plated on LB-agar petri dishes containing 2X
Carb.

2.7 Screening of cloned products 

The next day 10 colonies were randomly picked and grown
overnight in Falcon 2059 tubes (Becton Dickinson, Oxnard, CA
93030) containing 3 ml of LB-broth with 2X Carb.

5 µl of the cultures were diluted 1:10 with water and 5 µl of
this dilution were transferred into MicroAmp™ PCR tubes (Perkin
Elmer, Applied Biosystems, Foster City, CA).

15 µl of a 1.33X concentrated PCR mix were added to each
tube.

The 1.33X concentrated PCR mix contained the following
components:

10X PCR-buffer                2.0 µl
2 mM dNTPs                    2.0 µl
M13 rev primer (0.01 mM)      1.0 µl
Primer 2 (XLR, 0.01 mM)       1.0 µl
Taq Polymerase                0.15 µl
Water                         8.85 µl

Final Volume                 15.0 µl

The PCR cycling conditions were as follows:

Step 1    94° C for 60 sec
Step 2    94° C for 20 sec
Step 3    55° C for 30 sec
Step 4    72° C for 90 sec
Step 5    Repeat steps 2-4 for an additional 29 times
Step 6    72° C for 180 sec
Step 7    4° C (and holding)

Aliquots of the amplified products were run on a 0.8% agarose
gel in parallel with the 1 kb DNA ladder (Life Technologies,
Gaithersburg, MD 20897). Appropriate clones containing different
size inserts were selected for sequencing analysis.

2.8 Sequencing analysis of cloned products

The DNA of the selected clones was prepared using the
Wizard™ Minipreps DNA Purification System (Promega Corporation,
Madison, WI) following the instructions of the manufacturer.
Sequencing reactions were performed using the PRISM™ Ready
Reaction DyeDeoxy Terminator Cycle Sequencing Kit (Part No 401628,
Perkin Elmer, Applied Biosystems, Foster City, CA).

2.9 Analysis of sequenced products 

Three clones were selected for sequencing (87058.6, 87058.8,
87058.16). The sequences obtained (SEQ ID NOS:8-10, respectively)
were aligned using the DNASIS Multiple sequence alignment program
and are shown in Figures 8A through 8F. Clone 87058.6 initiated
at base 644 of the published sequence (HUMCATHB, SEQ ID NO:6),
clone 87058.8 initiated at base 353 of the published sequence, and
clone 87058.16 initiated at base 58 of the published sequence; the
original clone (87058, SEQ ID NO:7) initiated at base 1058 of the
published sequence.

Figures 8A through 8F show an alignment of the obtained
sequences with the published human cathepsin B nucleotide sequence.
Clone 87058.16 contains part of the 5' untranslated region and
therefore the full coding region of the gene.
Example 3

In Example 3, a full length cDNA (SEQ ID NO:11) of a novel
P2U purinergic receptor homolog was obtained by the inventive
method and is the subject of U.S. Patent Application 08/459,046
filed June 2, 1995, which is hereby incorporated by reference.
Inherit™ and BLAST search and alignment tools were used to relate
a partial sequence found in Incyte Clone 179696 from the placental
cDNA library to the GenBank sequence of RNU09402, a G-protein
coupled surface receptor from rat (Rice WR et al (1995) Am J
Respir Cell Molec Biol 12:27-32).

The cDNA of Incyte 179696 was extended to full length using a
modified XL-PCR (Perkin Elmer) procedure. Primers were designed
based on known sequence; one primer was synthesized to initiate
extension in the antisense direction (XLR) and the other to extend
sequence in the sense direction (XLF). The primers allowed the
sequence to be extended "outward" from the known sequence, thus
generating amplicons containing new, unknown nucleotide sequence
comprising the gene of interest. The primers were designed using
Oligo 4.0 (National Biosciences Inc, Plymouth MN) to be 22-30
nucleotides in length, to have a GC content of 50% or more, and to
anneal to the target sequence at temperatures of about 68°-72° C.
Any stretch of nucleotides which would result in hairpin
structures or primer-primer dimerizations was avoided.
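Two of the design criteria stated above, primer length and GC content, lend themselves to a simple check. The sketch below is illustrative only and is not a substitute for Oligo 4.0: it does not estimate annealing temperature and does not screen for hairpins or primer dimers. The function names and default thresholds are hypothetical choices mirroring the 22-30 nucleotide and 50% GC criteria, and the two test sequences are the XLR and XLF primers given in Example 4.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer (spaces are ignored)."""
    seq = primer.replace(" ", "").upper()
    if not seq:
        raise ValueError("empty primer")
    return sum(base in "GC" for base in seq) / len(seq)


def meets_design_criteria(primer: str,
                          min_len: int = 22,
                          max_len: int = 30,
                          min_gc: float = 0.5) -> bool:
    """Check only the length and GC-content criteria described above."""
    seq = primer.replace(" ", "").upper()
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Primers listed in Example 4.
XLR_EX4 = "GAAAGACAGCCACCACCACCACG"
XLF_EX4 = "AGAAAGCAAGGCAGTCCATTCAGG"

if __name__ == "__main__":
    for name, primer in (("XLR", XLR_EX4), ("XLF", XLF_EX4)):
        print(name, len(primer), round(gc_fraction(primer), 3),
              meets_design_criteria(primer))
```

Both Example 4 primers satisfy the length and GC-content criteria under this check.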

The cDNA library was used as a template, and XLR (bases
278-298) and XLF (bases 587-610) primers were used to extend and
amplify the 179696 sequence. By following the instructions for
the XL-PCR kit and thoroughly mixing the enzyme, high fidelity
amplification is obtained. Beginning with 25 pMol of each primer
and the recommended concentrations of all other components of the
kit, PCR was performed using the MJ PTC200 thermocycler (MJ
Research, Watertown MA) and the following parameters:

Step 1    94° C for 60 sec (initial denaturation)
Step 2    94° C for 15 sec
Step 3    65° C for 1 min
Step 4    68° C for 7 min
Step 5    Repeat steps 2-4 for 15 additional cycles
Step 6    94° C for 15 sec
Step 7    65° C for 1 min
Step 8    68° C for 7 min + 15 sec/cycle
Step 9    Repeat steps 6-8 for 11 additional cycles
Step 10   72° C for 8 min
Step 11   4° C (and holding)

At the end of 28 cycles, 50 µl of the reaction mix was
removed, and the remaining reaction mix was run for an additional
10 cycles as outlined below:

Step 1    94° C for 15 sec
Step 2    65° C for 1 min
Step 3    68° C for (10 min + 15 sec)/cycle
Step 4    Repeat steps 1-3 for 9 additional cycles
Step 5    72° C for 10 min

A 5-10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a low concentration (about 0.6-0.8%) agarose
mini-gel to determine which reactions were successful in extending
the sequence. Although all extensions potentially contain a full
length gene, some of the largest products or bands were selected
and cut out of the gel. Further purification involved using a
commercial gel extraction method such as QIAQuick™ (QIAGEN Inc,
Chatsworth CA). After recovery of the DNA, Klenow enzyme was used
to trim single-stranded nucleotide overhangs, creating blunt ends
which facilitated religation and cloning.

After ethanol precipitation, the products were redissolved in
13 µl of ligation buffer. Then, T4 DNA ligase (15 units) and
1 µl T4 polynucleotide kinase were added, and the mixture was
incubated at room temperature for 2-3 hours or overnight at 16° C.
Competent E. coli cells (in 40 µl of appropriate media) were
transformed with 3 µl of ligation mixture and cultured in 80 µl of
SOC medium (Sambrook J et al, supra). After incubation for one
hour at 37° C, the whole transformation mixture was plated on
Luria Broth (LB)-agar (Sambrook J et al, supra) containing
carbenicillin at 25 mg/L. The following day, 12 colonies were
randomly picked from each plate and cultured in 150 µl of liquid
LB/carbenicillin medium placed in an individual well of an
appropriate, commercially-available, sterile 96-well microtiter
plate. The following day, 5 µl of each overnight culture was
transferred into a non-sterile 96-well plate and, after dilution
1:10 with water, 5 µl of each sample was transferred into a PCR
array.

For PCR amplification, 15 µl of concentrated PCR reaction mix
(1.33X) containing 0.75 units of Taq polymerase, a vector primer
and one or both of the gene specific primers used for the
extension reaction were added to each well. Amplification was
performed using the following conditions:

Step 1    94° C for 60 sec
Step 2    94° C for 20 sec
Step 3    55° C for 30 sec
Step 4    72° C for 90 sec
Step 5    Repeat steps 2-4 for an additional 29 cycles
Step 6    72° C for 180 sec
Step 7    4° C (and holding)



Aliquots of the PCR reactions were run on agarose gels
together with molecular weight markers. The sizes of the PCR
products were compared to the original partial cDNAs, and
appropriate clones were selected, ligated into plasmid and
sequenced.

Example 4 

In this example, the inventive method was used to obtain a
novel full length cDNA from the partial sequence found in Incyte
clone 08118, which was found to be somewhat homologous to the
GenBank sequence of C5a anaphylatoxin receptor, a G-protein
coupled surface receptor from dog (Perret J et al (1995) Biochem
J 288:911-17). Based on the partial cDNA sequence, primers (XLR
= GAAAGACAGCCACCACCACCACG and XLF = AGAAAGCAAGGCAGTCCATTCAGG)
were designed. Essentially the same method outlined in Example 3
above was used to extend the partial sequence of 8118 to obtain
the full length sequence (SEQ ID NO:12) of a novel C5a-like
receptor homolog, which is the subject of U.S. Patent Application
08/462,355 filed June 5, 1995, and whose disclosure is
incorporated by reference.

While the present invention has been described with reference
to specific enzymes and sequences, particularly PCR enzyme, and
formulations containing such, those skilled in the art understand
that various changes may be made and equivalents may be
substituted without departing from the true spirit and scope of
the invention. In addition, many modifications may be made to
adapt a particular situation, material, enzyme, process, process
step or steps and still carry out the objective, spirit and scope
of the invention. All such modifications are intended to be
within the scope of the claims appended hereto.

SEQUENCE LISTING



(1) GENERAL INFORMATION: 

(i) APPLICANT: INCYTE PHARMACEUTICALS, INC.

(ii) TITLE OF INVENTION: IMPROVED METHOD FOR OBTAINING 

FULL LENGTH cDNA SEQUENCES 

(iii) NUMBER OF SEQUENCES: 12 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 

(B) STREET: 3330 Hillview Avenue 

(C) CITY: Palo Alto 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94304 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: To Be Assigned 

(B) FILING DATE: Filed Herewith 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/487,112 

(B) FILING DATE: 7-JUN-1995 

(vii) PRIOR APPLICATION DATA:

(A) APPLICATION SERIAL NO: US 08/462,355

(B) FILING DATE: 5-JUN-1995

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/459,046 

(B) FILING DATE: 2-JUN-1995 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 08/566,334 

(B) FILING DATE: 1-DEC-1995

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION SERIAL NO: US 60/006,809 

(B) FILING DATE: 15-NOV-1995 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Luther, Barbara J. 

(B) REGISTRATION NUMBER: 33954 

(C) REFERENCE/DOCKET NUMBER: HP-001-1 PCT

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: 415-855-0555

(B) TELEFAX: 415-852-0195

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2543 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GenBank HUMHSP90 

(B) CLONE: Accession No. M16660 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:l: 



CTCCGGCGCA GTGTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT GCTCACTATT 60

ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC ACCATGGAGA GGAGGAGGTG 120

GAGACTTTTG CCTTTCAGGC AGAAATTGCC CAACTCATGT CCCTCATCAT CAATACCTTC 180

TATTCCAACA AGGAGATTTT CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC 240

AAGATTCGCT ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300

ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC AGGCATTGGC 360

ATGACCAAAG CTGATCTCAT AAATAATTTG GGAACCATTG CCAAGTCTGG TACTAAAGCA 420

TTCATGGAGG CTCTTCAGGC TGGTGCAGAC ATCTCCATGA TTGGGCAGTT TGGTGTTGGC 480

TTTTATTCTG CCTACTTGGT GGCAGAGAAA GTGGTTGTGA TCAGAAAGCA CAACGATGAT 540

GAACAGTATG CTTGGGAGTC TTCTGCTGGA GGTTCCTTCA CTGTGCGTGC TGACCATGGT 600

GAGCCCATTG GCATGGGTAC CAAAGTGATC CTCCATCTTA AAGAAGATCA GACAGAGTAC 660

CTAGAAGAGA GGCGGGTCAA AGAAGTAGTG AAGAAGCATT CTCAGTTCAT AGGCTATCCC 720

ATCACCCTTT ATTTGGAGAA GGAACGAGAG AAGGAAATTA GTGATGATGA GGCAGAGGAA 780

GAGAAAGGTG AGAAAGAAGA GGAAGATAAA GATGATGAAG AAAAGCCCAA GATCGAAGAT 840

GTGGGTTCAG ATGAGGAGGA TGACAGCGGT AAGGATAAGA AGAAGAAAAC TAAGAAGATC 900

AAAGAGAAAT ACATTGATCA GGAAGAACTA AACAAGACCA AGCCTATTTG GACCAGAAAC 960

CCTGATGACA TCACCCAAGA GGAGTATGGA GAATTCTACA AGAGCCTCAC TAATGACTGG 1020

GAAGACCACT TGGCAGTCAA GCACTTTTCT GTAGAAGGTC AGTTGGAATT CAGGGCATTG 1080

CTATTTATTC CTCGTCGGGC TCCCTTTGAC CTTTTTGAGA ACAAGAAGAA AAAGAACAAC 1140

ATCAAACTCT ATGTCCGCCG TGTGTTCATC ATGGACAGCT GTGATGAGTT GATACCAGAG 1200

TATCTCAATT TTATCCGTGG TGTGGTTGAC TCTGAGGATC TGCCCCTGAA CATCTCCCGA 1260

GAAATGCTCC AGCAGAGCAA AATCTTGAAA GTCATTCGCA AAAACATTGT TAAGAAGTGC 1320 

CTTGAGCTCT TCTCTGAGCT GGCAGAAGAC AAGGAGAATT ACAAGAAATT CTATGAGGCA 1380 

TTCTCTAAAA ATCTCAAGCT TGGAATCCAC GAAGACTCCA CTAACCGCCG CCGCCTGTCT 1440 

GAGCTGCTGC GCTATCATAC CTCCCAGTCT GGAGATGAGA TGACATCTCT GTCAGAGTAT 1500 

GTTTCTCGCA TGAAGGAGAC ACAGAAGTCC ATCTATTACA TCACTGGTGA GAGCAAAGAG 1560 

CAGGTGGCCA ACTCAGCTTT TGTGGAGCGA GTGCGGAAAC GGGGCTTCGA GGTGGTATAT 1620 

ATGACCGAGC CCATTGACGA GTACTGTGTG CAGCAGCTCA AGGAATTTGA TGGGAAGAGC 1680 

CTGGTCTCAG TTACCAAGGA GGGTCTGGAG CTGCCTGAGG ATGAGGAGGA GAAGAAGAAG 1740 

ATGGAAGAGA GCAAGGCAAA GTTTGAGAAC CTCTGCAAGC TCATGAAAGA AATCTTAGAT 1800 

AAGAAGGTTG AGAAGGTGAC AATCTCCAAT AGACTTGTGT CTTCACCTTG CTGCATTGTG 1860 

ACCAGCACCT ACGGCTGGAC AGCCAATATG GAGCGGATCA TGAAAGCCCA GGCACTTCGG 1920 

GACAACTCCA CCATGGGCTA TATGATGGCC AAAAAGCACC TGGAGATCAA CCCTGACCAC 1980 

CCCATTGTGG AGACGCTGCG GCAGAAGGCT GAGGCCGACA AGAATGATAA GGCAGTTAAG 2040 

GACCTGGTGG TGCTGCTGTT TGAAACCGCC CTGCTATCTT CTGGCTTTTC CCTTGAGGAT 2100 

CCCCAGACCC ACTCCAACCG CATCTATCGC ATGATCAAGC TAGGTCTAGG TATTGATGAA 2160 

GATGAAGTGG CAGCAGAGGA ACCCAATGCT GCAGTTCCTG ATGAGATCCC CCCTCTCGAG 2220 

GGCGATGAGG ATGCGTCTCG CATGGAAGAA GTCGATTAGG TTAGGAGTTC ATAGTTGGAA 2280 

AACTTGTGCC CTTGTATAGT GTCCCCATGG GCTCCCACTG CAGCCTCGAG TGCCCCTGTC 2340

CCACCTGGCT CCCCCTGCTG GTGTCTAGTG TTTTTTTCCC TCTCCTGTCC TTGTGTTGAA 2400 

GGCAGTAAAC TAAGGGTGTC AAGCCCCATT CCCTCTCTAC TCTTGACAGC AGGATTGGAT 2460 

GTTGTGTATT GTGGTTTATT TTATTTTCTT CATTTTGTTC TGAAATTAAA GTATGCAAAA 2520 

TAAAGAATAT GCCGTTTTTA TAC 2543 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 261 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: THP-1 

(B) CLONE: 14201 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 



AAGAAAAAGA ACAACATCAA ACTCTATGTC CGCCGTGTGT TCATCATGGC AGCTGTGATG 60

AGTTGATACC AGAGTATCTC AATTTTATCC GTGGTGTGGT TGACTTGAGG TCTGCCCCTG 120

AACATCTCCC GGAAATGCTC CAGCAGAGCA AAATCTTGAA AGGCATTCGC AAAAACATTG 180

TTAAGAGTGC CTTAGCTCTT CTCTAGCTGG CAGAAGCAAG GGGATTTCAA GAAATTCTTT 240

TGGGGGGATT TCTTAAAAAT T 261

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 478 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP-1 

(B) CLONE: 14201.3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

GCTGGGTATC GGAAAGCAAG CCTACGTTGC TCACTATTAC GTATAATCCT TTTCTTCAAG 60 

ATGCCTGAGG AAGTGCACCA TGGAGAGGAG GAGGTGGAGA CTTTTGCCTT TCAGGCAGAA 120 

ATTGCCCAAC TCATGTCCCT CATCATCAAT ACCTCCTATT CCAACAAGGA GATTTCCTCG 180 

GGAGTTGATC TCTAATGCTT CTGATGCCTC GGACAAGATT CGCTATGAAG CCTGACAGAC 240 

CCTTCGAAGT GGTCAGCGGC AAGAGCTGAA AATTGACATC ATCCCCAACC CTCAGGAACG 300 

TCCCTGTACT TTGGGTAGAC ACAGGCATTG GCATAAACAA AGCTGACCTC ATATTATTCG 360 

GGGAACCATT GCCAAGTCTT GTCTAAAAGC ATTCATGGAG GCTCTCAGGT TGGCGCAGAC 420 

ATCTCCAGAT TGGCAGGTGG GTGTTGGCTT TATTCTGCCC ACTTGGTGGC AGAGAAAT 478 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 508 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: THP-1

(B) CLONE: 14201.5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GTTGGGACTG TCTGGGTATC GGAAAGCAAG CCTACGTTGC TCACTATTAC GTATAATCCT 60 

TTTCTTTTCA AGATGCCTGA GGAAGTGCAC CATGGAGAGG AGGAGGTGGA GACTTTTGCC 120 

TTTCAGGCAG AAATTGCCCA ACTCATGTCC CTCATCATCA ATACCTCCTA TTCCAACAAG 180 

GAGATTTTCC TTCGGGAGTT GATCTCTAAT GCTTCTGATG CCTTGGACAA GATTCGCTAT 240 

GAGAGCCTGA CAGACCCTTC GAAGTTGGAC AGTGGTAAAG AGCTGAAAAT TGACATCATC 300 

CCCAACCCTC AGGAACGTAC CCTGACTTTG GGTAGACACA GGCATCGGCA TGACCAAAAG 360 

CTGATCTCAT AATAATTGGG AACCATTGCA AGTCTGGTAC TAAAGCATTC ATGGAGGCTC 420 

TTCAGGCTGG TGCAGACATC TCCATGATTG GGCAGCTTGG GTGTTGCTTT ATTCTGCCTC 480 

CTTGGTGGCA GAGAAAGTGT TGTGATCA 508 

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 547 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: THP-1 

(B) CLONE: 14201.13 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

TTGAGAGTAT GTCGAGTTAC TGTGGAGGTT CCTTCACTGC GTGCTGACAT GGTGAGCCCA 60 

TGGGAGCGGT ACCAAGTGAT CCTCCATCTC AAAGAAGATC AGACAGAGTA CCTAGAGAGA 120 

GGCGGATCAA AGAGTAGTGA TGAGCATCCT CAGATCATAG GCTATCCCAT CACCCTTTTT 180 

TGGAGAAGGA CGAGAGAAGG AATTAGGATG ATGAGGCAGA GGAAGAGAAT GGTGAGAATG 240 

AAGAGGAGTA ACGATGATGA AGAAACCCCA AGATCGATGA TGTGGTTCAG ATGAGGGGAT 300 

GACAGCGGTA GATAAGAAGA AGAAACTAGA ATCATCGGAT CATGACAGGA AGAACTAACA 360 

GATCATCTTT CGGCCAGAAT CCCTGATGTC ATCACCCAAG AGGGTATGGA GATTTCTACA 420 

TGCAGCTCAC TTTACTGGGC AAGACACTTG GCAGCAACAC TTTTCTGTAG AAGGCCATTG 480

CATCACGCAT TGCTATTCTT CCCTCGCCGT CTCCTTTGAC CTGGTCTGGC ATCATGGTGT 540

CTTGATC 547

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1996 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: GenBank HUMCATHB 

(B) CLONE: Accession No. L16510 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6: 

TCCGGCAACG CCAACCGCTC CGCTGCGCGC AGGCTGGGCT GCAGGCTCTC GGCTGCAGCG 60 

CTGGGCTGGT GTGCAGTGGT GCGACCACGG CTCACGGCAG CCTCAGCCAC CCAGATGTAA 120 

GCGATCTGGT TCCCACCTCA GCCTCCCGAG TAGTGGATCT AGGATCCGGC TTCCAACATG 180 

TGGCAGCTCT GGGCCTCCCT CTGCTGCCTG CTGGTGTTGG CCAATGCCCG GAGCAGGCCC 240 

TCTTTCCATC CCCTGTCGGA TGAGCTGGTC AACTATGTCA ACAAACGGAA TACCACGTGG 300 

CAGGCCGGGC ACAACTTCTA CAACGTGGAC ATGAGCTACT TGAAGAGGCT ATGTGGTACC 360 

TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA CCGAGGACCT GAAGCTGCCT 420 

GCAAGCTTCG ATGCACGGGA ACAATGGCCA CAGTGTCCCA CCATCAAAGA GATCAGAGAC 480 

CAGGGCTCCT GTGGCTCCTG CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC 540 

TGCATCCACA CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CGGAGGACCT GCTCACATGC 600 

TGTGGCAGCA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC TTGGAACTTC 660 

TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT CCCATGTAGG GTGCAGACCG 720 

TACTCCATCC CTCCCTGTGA GCACCACGTC AACGGCTCCC GGCCCCCATG CACGGGGGAG 780 

GGAGATACCC CCAAGTGTAG CAAGATCTGT GAGCCTGGCT ACAGCCCGAC CTACAAACAG 840 

GACAAGCACT ACGGATACAA TTCCTACAGC GTCTCCAATA GCGAGAAGGA CATCATGGCC 900 

GAGATCTACA AAAACGGCCC CGTGGAGGGA GCTTTCTCTG TGTATTCGGA CTTCCTGCTC 960 

TACAAGTCAG GAGTGTACCA ACACGTCACC GGAGAGATGA TGGGTGGCCA TGCCATCCGC 1020 

ATCCTGGGCT GGGGAGTGGA GAATGGCACA CCCTACTGGC TGGTTGCCAA CTCCTGGAAC 1080 

ACTGACTGGG GTGACAATGG CTTCTTTAAA ATACTCAGAG GACAGGATCA CTGTGGAATC 1140

GAATCAGAAG TGGTGGCTGG AATTCCACGC ACCGATCAGT ACTGGGAAAA GATCTAATCT 1200

GCCGTGGGCC TGTCGTGCCA GTCCTGGGGG CGAGATCGGG GTAGAAATGC ATTTTATTCT 1260

TTAAGTTCAC GTAAGATACA AGTTTCAGGC AGGGTCTGAA GGACTGGATT GGCCAAACAT 1320

CAGACCTGTC TTCCAAGGAG ACCAAGTCCT GGCTACATCC CAGCCTGTGG TTACAGTGCA 1380

GACAGGCCAT GTGAGCCACC GCTGCCAGCA CAGAGCGTCC TTCCCCCTGT AGACTAGTGC 1440

CGTGGGAGTA CCTGCTGCCC AGCTGCTGTG GCCCCCTCCG TGATCCATCC ATCTCCAGGG 1500

AGCAAGACAG AGACGCAGGA TGGAAAGCGG AGTTCCTAAC AGGATGAAAG TTCCCCCATC 1560

AGTTCCCCCA GTACCTCCAA GCAAGTAGCT TTCCACATTT GTCACAGAAA TCAGAGGAGA 1620

GATGGTGTTG GGAGCCCTTT GGAGAACGCC AGTCTCCAGG TCCCCCTGCA TCTATCGAGT 1680

TTGCAATGTC ACAACCTCTC TGATCTTGTG CTCAGCATGA TTCTTTAATA GAAGTTTTAT 1740

TTTTCGTGCA CTCTGCTAAT CATGTGGGTG AGCCAGTGGA ACAGCGGGAG CCTGTGCTGG 1800

TTTGCAGATT GCCTCCTAAT GACGCGGCTC AAAAGGAAAC CAAGTGGTCA GGAGTTGTTT 1860

CTGACCCACT GATCTCTACT ACCACAAGGA AAATAGTTTA GGAGAAACCA GCTTTTACTG 1920

TTTTTGAAAA ATTACAGCTT CACCCTGTCA AGTTAACAAG GAATGCCTGT GCCAATAAAA 1980

GGTTTCTCCA ACTTGA 1996

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 294 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: LIVER

(B) CLONE: 87058

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGGCACGAGC CAACTCCTGG AACACTGACT GGGGTGACAA TGGCTTCTTT AAAATACTCA 60

GAGGACAGGT TCACTGTGGA ATCGAATCAG AAGTGGTGGC TGGAATTCCA CGCACCGTTC 120

AGTACTGGGA AAAGTCTAAT CTGCCGTGGG CCTTCGTGCC AGTCCTGGGG GCGAGATGGG 180

GGTAGAAATG CATTTTATTC TTTAAGTTCA CGTAAGATAC AAGTTTCAGA CAGGGGTCTA 240

AGGCCTGGTT GCCAAAATCA GACCTGTTTT TCAAGGGGCC CAAGTCCTGG GTTC 294

(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 552 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Liver 

(B) CLONE: 87058.6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

GTGAAGCTTG GAACTTCTGG ACAAGAAAAG GCCTGGTTTC TGGTGGCCTC TATGAATCCC 60

ATGTAGGGTG CAGACCGTAC TCCATCCCTC CCTGTGAGCA CCACGTCAAC GGCTCCCGGC 120

CCCCATGCAC GGGGGAGGGA GATACCCCCA AGTGTAGCAA GATCTGTGAG CCTGGCTACA 180

GCCCGACCTA CAAACAGGAC AAGCACTACG GATACAATTC CTACAGCGTC TCCAATAGCG 240

AGAAGGACAT CATGGCCGAG ATCTACAAAA ACGGCCCCGT GGAGGGAGCT TTCTCTGTGT 300

ATTCGGACTT CCTGCTCTAC AAGTCAGGAG TGTACCAACA CGTCACCGGA GAGATGATGG 360

GTGGCCATGC CATCCGCATC CTGGGCTGGG GAGTGGAGAA TGGCACAACC TACTGGCTGG 420

TTGGCAACTC CTGGAACACT GACTGGGGTG ACAATGGGTT CACTGTGGAA TCGAATCAGA 480

AGTGGTGGTG GAATTCCACG CACGATCAAG TGCTGGGAAA AGATCTTAAT CTGCCGGGGC 540

TGTCGGCCAG TC 552



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 559 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Liver

(B) CLONE: 87058.8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

GAGGTACCTT CCTGGGTGGG CCCAAGCCAC CCCAGAGAGT TATGTTTACC GAGGACCTGA 60

AGCTGCCTGC AAGCTTCGAT GCACGGGAAC AATGGCCACA GTGTCCCACC ATCAAAGAGA 120

TCAGAGACCA GGGTCCTGTG GCTCCTGCTG GGCCTTCGGG GCTGTGGAAG CCATCTCTGA 180

CCGGATCTGA TCCACACCAA TGCGCACGTC AGCGTGGAGG TGTCGGCGGA GGACTGCTCA 240

CATGCTGTGG CAGATGTGTG GGGACGGCTG TAATGGTGGC TATCCTGCTG AAGCTTGGAC 300 

TTCTGGACAA GAAAAGGCCC TGGTTTCTGG TGGCCTCTAT GATCCCATGT AGGGTGTAGA 360 

CCGTACTCCA TCCCTCCCTG TGAAGCACCA CGTCAACGGT TCCCGGGCCC CATGCACGGG 420 

GAGGGAGATA CCCCCAAGTG TAACAAGATC TGTGAGCCTG GGTACAGTCC CGACCACAAA 480 

CAGGAAAAGC ACTACGGATA CAATTCCTCA GGTCTCCAAT AGTGAGAAGG GACATCATGC 540 

CGAGATCTAC AATAACGGC 559 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Liver 

(B) CLONE: 87058.16 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

CGGTTGAGAT TCGGACAGTC CGAAAACGTC CGGCAAGTCA CCCGCTCCGC TGGCGCAGGC 60 

TGGGTGCAGG CTCTCGGTGC AGGCTGGGTG GATCTAGGAT CCGGCTTCCA ACATGTGGCA 120 

GTTCTGGGCC TCCCTCTGTG CCTGCTGGTG TTGGACAATG CCCGGAGGAG GCCTCTTTCC 180 

ATCCCCTGTC GGATGAGCTG GTCACTATGT CAACAAACGG AATACCACGT GGAGGCCGGG 240 

AACAACTTCT ACAACGTGGA CATGAGCTAC TTGAGAGGTA TGTGGTACCT TCCTGGGTGG 300 

GCCCAAGCCA CCCCAGAGAG TTTGTTTACC GAGGACCTGA GCTGCCTGCA AGCTTCGAAG 360 

GACGGGAACA ATGGCCACAG TGTCCCACCA TCAAAGAGAT CAGAGACAGG GCTCCTGTGG 420 

TCCTGCTGGG CCTCCGGGGC TGTGGAAGCA TCTCTGACCG GATCTGCATC CACACCAATG 480 

GCACGTCAGC GTGGTGGTGT CGGGGAGGAC CTGATCACCT TTGTGGTAGC ATGTGTGGGG 540 

GACGGCTGTA ATGGTGGTTA TCCTGTGAAG CTGGGCCTTC TAGAAAGAAA AGGCTGTTTT 600 

GGTGGCCTTA TGACTCCCAT GT 622 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 984 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: Placenta 

(B) CLONE: 179696 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



ATGGAATGGG ACAATGGCAC AGACCAGGCT CTGGGCTTGC CACCCACCAC CTGTGTCTAC 60

CGCGAGAACT TCAAGCAACT GCTGCTCCCA CCTGTGTATT CGGCGGTGCT GGCGCCTGCC 120

CTCCCGCTGA ACATCTGTGT CATTACCCAG ATCTGCACGT CCCGCCGGGC CCTGACCCGC 180

ACGGCCGTGT ACACCCTAAA CCTTGCTCTG CCTGACCTGC TATATGCCTG CTCCCTGCCC 240

CTGCTCATCT ACAACTATGC CCAAGGTGAT CACTGGCCCT TTGGCGACTT CGCCTGCCGC 300

CTGGTCCGCT TCCTCTTCTA TGCCAACCTG CACGGGAGGA TCCTCTTCCT CACCTGCATC 360

AGCTTCCAGC GCTACCTGGG CATCTGCCAC CCGCTGGCCC CCTGGCACAA ACGTGGGGGC 420

CGCCGGGCTG CCTGGCTAGT GTGTGTAGCC GTGTGGCTGG CCGTGACAAC CCAGTGCCTG 480

CCCACAGCCA TCTTCGCTGC CACAGGCATC CAGCGTAACC GCACTGTCTG TTATGACCTC 540

AGCCCGCCTG CCCTGGCCAC CCACTATATG CCCTATGGGA TGGCTCTCAC TGTCATCGGC 600

TTCCTGCTGC CCTTTGCTGC CCTGCTGGCC TGCTACTGTC TCCTGGCCTG CCGCCTGTGC 660

CGCCAGGATG GCCCGGCAGA GCCTGTGGCC CAGGAGCGGC GTGGCAAGGC GGCCCGCATG 720

GCCGTGGTGG TGGCTGCTGT CTTTGGCATC AGCTTCCTGC CTTTTCACAT CACCAAGACA 780

GCCTACCTGG CAGTGCGCTC GACGCCGGGC GTCCCCTGCA CTGTATTGGA GGCCTTTGCA 840

GCGGCCTACA AAGGCACGCG GCCGTTTGCC AGTGCCAACA GCGTGCTGGA CCCCATCCTC 900

TTCTACTTCA CCCAGAAGAA GTTCCGCCGG CGACCACATG AGCTCCTACA GAAACTCACA 960

GACAAATGGC AGAGGCAGGG TCGC 984


(2) INFORMATION FOR SEQ ID NO: 12: 







(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1446 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(vii) IMMEDIATE SOURCE:

(A) LIBRARY: Mast Cell

(B) CLONE: 8118 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

ATGGCGTCTT TCTCTGCTGA GACCAATTCA ACTGACCTAC TCTCACAGCC ATGGAATGAG 60 

CCCCCAGTAA TTCTCTCCAT GGTCATTCTC AGCCTTACTT TTTTACTGGG ATTGCCAGGC 120 

AATGGGCTGG TGCTGTGGGT GGCTGGCCTG AAGATGCAGC GGACAGTGAA CACAATTTGG 180 

TTCCTCCACC TCACCTTGGC GGACCTCCTC TGCTGCCTCT CCTTGGCCTT CTCGCTGGCT 240 

CACTTGGCTC TCCAGGGACA GTGGCCCTAC GGCAGGTTCC TATGCAAGCT CATCCCCTCC 300 

ATCATTGTCC TCAACATGTT TGGCAGTGTC TTCCTGCTTA CTGCCATTAG CCTGGATCGC 360 

TGTCTTGTGG TATTCAAGCC AATCTGGTGT CAGAATCATC GCAATGTAGG GATGGCCTGC 420 

TCTATCTGTG GATGTATCTG GGTGGTGGCT TTTGTGTTGT GCATTCCTGT GTTCGTGTAC 480 

CGGGAAATCT TCACTACAGA CAACCATAAT AGATGTGGCT ACAAATTTGG TCTCTCCAGC 540 

TCATTAGATT ATCCAGACTT TTATGGGGAT CCACTAGAAA ACAGGTCTCT TGAAAACATT 600 

GTTCAGCCGC CTGGAGAAAT GAATGATAGG TTAGATCCTT CCTCTTTCCA AACAAATGAT 660 

CATCCTTGGA CAGTCCCCAC TGTCTTCCAA CCTCAAACAT TTCAAAGACC TTCTGCAGAT 720 

TCACTCCCTA GGGGTTCTGC TAGGTTAACA AGTCAAAATC TGTATTCTAA TGTATTTAAA 780 

CCTGCTGATG TGGTCTCACC TAAAATCCCC AGTGGGTTTC CTATTGAAGA TCACGAAACC 840 

AGCCCACTGG ATAACTCTGA TGCTTTTCTC TCTACTCATT TAAAGCTGTT CCCTAGCGCT 900 

TCTAGCAATT CCTTCTACGA GTCTGAGCTA CCACAAGGTT TCCAGGATTA TTACAATTTA 960 

GGCCAATTCA CAGATGACGA TCAAGTGCCA ACACCCCTCG TGGCAATAAC GATCACTAGG 1020 

CTAGTGGTGG GTTTCCTGCT GCCCTCTGTT ATCATGATAG CCTGTTACAG CTTCATTGTC 1080 

TTCCGAATGC AAAGGGGCCG CTTCGCCAAG TCTCAGAGCA AAACCTTTCG AGTGGCCGTG 1140 

GTGGTGGTGG CTGTCTTTCT TGTCTGCTGG ACTCCATACC ACATTTGGGG AGTCCTGTCA 1200 

TTGCTTACTG ACCCAGAAAC TCCCTTGGGG AAAACTCTGA TGTCCTGGGA TCATGTATGC 1260 

ATTGCTCTAG CATCTGCCAA TAGTTGCTTT AATCCCTTCC TTTATGCCCT CTTGGGGAAA 1320 

GATTTTAGGA AGAAAGCAAG GCAGTCCATT CAGGGAATTC TGGAGGCAGC CTTCAGTGAG 1380 

GAGCTCACAC GTTCCACCCA CTGTCCCTCA AACAATGTCA TTTCAGAAAG AAATAGTACA 1440 

ACTGTG 1446

CLAIMS

1. A method of extending the sequence of a partial complementary 
DNA (cDNA) using polymerase chain reaction (PCR) , comprising the 
steps of: 

a) combining a first and second PCR primer with nucleic acid
from a cDNA library expected to contain said partial cDNA, or a
genomic library, under conditions suitable for synthesis of
nucleic acid PCR products from the first and second primers,
wherein said first and second primers are capable of annealing to
opposite strands of the partial cDNA or genomic DNA and initiating
nucleic acid synthesis in an outward manner and wherein the first
primer is capable of being extended by DNA polymerase in an
antisense direction and the second primer is capable of being
extended in a sense direction,

b) purifying the PCR products, and

c) identifying extended nucleotide sequences derived from 
said partial cDNA or said genomic DNA. 

2. The method of Claim 1 wherein identifying extended sequences 
comprises nucleic acid sequencing. 

3. The method of Claim 2 further comprising extending the
nucleotide sequences of step 1c by repeating steps 1a through
1c on the nucleotide sequences identified in step 1c.

4. A method of extending the nucleotide sequence of a partial
complementary DNA (cDNA) using polymerase chain reaction
(PCR), comprising the steps of:

a) combining a first and second PCR primer with nucleic acid 
from a cDNA library expected to contain said partial cDNA, or a 
genomic library, under conditions suitable for synthesis of 
nucleic acid PCR products from the first and second primers, 
wherein said first and second primers are capable of annealing to
opposite strands of the partial cDNA or genomic DNA and initiating
nucleic acid synthesis in an outward manner and wherein the first
primer is capable of being extended by DNA polymerase in an
antisense direction and the second primer is capable of being
extended in a sense direction,

b) purifying the PCR products, 

c) ligating the purified PCR products under conditions 
suitable for the formation of circular closed nucleic acid, 

d) transforming a host cell with the circular closed nucleic 
acid and culturing the transformed host cell under conditions 
suitable for growth, 

e) recovering said circular closed nucleic acid from the 
cultured, transformed host cell, 

f) identifying extended nucleotide sequences derived from 
said partial cDNA or said genomic DNA. 

5. The method of Claim 4 wherein identifying extended sequences 
comprises nucleic acid sequencing. 

6. The method of Claim 4 wherein culturing the transformed host
cell under conditions suitable for growth comprises culturing
in the presence of selective antibiotic conditions.

7. The method of Claim 4 wherein said host cell is E. coli.

8. The method of Claim 4 wherein after step 4b and prior to step
4c, the purified PCR products are treated under conditions
suitable for converting nucleic acid overhangs to blunt ends.
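
The primer geometry recited in Claims 1 and 4 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the two primers sit at the extreme ends of the known partial sequence and are 20 nucleotides long. The function names are hypothetical, and no melting-temperature or specificity checks are made.

```python
from typing import Tuple

# Watson-Crick complements; N is kept as N so ambiguous bases pass through.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def outward_primers(partial_cdna: str, length: int = 20) -> Tuple[str, str]:
    """Return (first_primer, second_primer), both written 5'->3'.

    partial_cdna is the sense strand of the known partial sequence.
    Hypothetical helper: primer placement and length are assumptions.
    """
    seq = partial_cdna.upper()
    if length <= 0 or len(seq) < 2 * length:
        raise ValueError("partial sequence too short for two non-overlapping primers")
    # First primer: reverse complement of the 5' end of the known sequence.
    # It anneals to the sense strand and is extended in the antisense
    # direction, away from the known sequence (toward upstream sequence).
    first_primer = reverse_complement(seq[:length])
    # Second primer: identical to the 3' end of the known sense strand.
    # It anneals to the antisense strand and is extended in the sense
    # direction, away from the known sequence (toward downstream sequence).
    second_primer = seq[-length:]
    return first_primer, second_primer


if __name__ == "__main__":
    # First 60 nucleotides of SEQ ID NO: 12, used only as sample input.
    known = "ATGGCGTCTTTCTCTGCTGAGACCAATTCAACTGACCTACTCTCACAGCCATGGAATGAG"
    first, second = outward_primers(known)
    print("first (antisense-extending) primer:", first)
    print("second (sense-extending) primer:", second)
```

Because both primers point away from each other, amplification proceeds around the circular library plasmid rather than across the known region.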
Step 1 Partial cDNA sequence from public database or a researcher's
earlier efforts

Step 2 Two primers (XLR/XLS) designed based on partial sequence

Step 3 Amplification of plasmids containing the gene of interest

Step 4 Purification of the amplified DNA fragments

Step 5 Religation of the amplified DNA fragments to circular closed DNA

Step 6 Transformation of the circular closed DNA into E. coli cells

Step 7 Growth of individual clones in liquid media under appropriate
selection (e.g. Carb)

Step 8 PCR-screening of the individual clones for different insert sizes
upstream of the XLR-priming site

Step 9 Selection of clones for sequence analysis

Step 10 Sequencing of clones of interest
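
After Step 10, each new read is compared with the known partial sequence for areas of overlap so that the sequence can be extended beyond the overlapping region. A minimal sketch of that comparison follows. It assumes exact matching and extension in the 3' direction only. The function name and the minimum-overlap threshold are hypothetical, and real reads containing ambiguous bases (N) or mismatches would need a tolerant alignment instead.

```python
def extend_with_overlap(known: str, new_read: str, min_overlap: int = 15) -> str:
    """Extend `known` in the 3' direction using `new_read`.

    Looks for the longest prefix of `new_read` (at least `min_overlap`
    bases) that exactly matches a suffix of `known`. If one is found,
    the non-overlapping remainder of `new_read` is appended. Otherwise
    `known` is returned unchanged.
    """
    known = known.upper()
    new_read = new_read.upper()
    longest_possible = min(len(known), len(new_read))
    # Try the longest candidate overlap first, then shrink.
    for k in range(longest_possible, min_overlap - 1, -1):
        if known.endswith(new_read[:k]):
            return known + new_read[k:]
    return known


if __name__ == "__main__":
    # Toy sequences chosen only to demonstrate the overlap logic.
    partial = "AAACCCGGGTTT"
    read = "GGGTTTACGT"
    print(extend_with_overlap(partial, read, min_overlap=6))
```

Extension in the 5' direction can be handled the same way by applying the function to the reverse complements of both sequences.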
The purified DNA segments
are religated and form a
circular plasmid

The circular closed plasmids
are then transformed into
E. coli and grown as colonies on
LB agar 2xCarb plates
10 20 30 40 50

Hsp 90 1 CTCCGGCGCA GTGTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201 1 50 

14201.3 1 gCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201.5 1 GTTGGGAC TGTCTGGGTA TCGGAAAGCA AGCCTACGTT 50 

14201.13 1 50

60 70 80 90 100 

Hsp 90 51 GCTCACTATT ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC 100

14201 51 100

14201.3 51 GCTCACTATT ACGTATAATC CTTTTCTNTN CAAGATGCCT GAGGAAGTGC 100

14201.5 51 GCTCACTATT ACGTATAATC CTTTTCTTTT CAAGATGCCT GAGGAAGTGC 100

110 120 130 140 150 

Hsp 90 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201 101 150 

14201.3 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201.5 101 ACCATGGAGA GGAGGAGGTG GAGACTTTTG CCTTTCAGGC AGAAATTGCC 150 

14201.13 101 150 

160 170 180 190 200 

Hsp 90 151 CAACTCATGT CCCTCATCAT CAATACCTTC TATTCCAACA AGGAGATTTT 200 

14201 151 200 

14201.3 151 CAACTCATGT CCCTCATCAT CAATACCTCC TATTCCAACA AGGAGATTNT 200 

14201.5 151 CAACTCATGT CCCTCATCAT CAATACCTCC TATTCCAACA AGGAGATTTT 200 

14201.13 151 200 

210 220 230 240 250 

Hsp 90 201 CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC AAGATTCGCT 250 

14201 201 250

14201.3 201 CCTNCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTCGGAC AAGATTCGCT 250 

14201.5 201 CCTTCGGGAG TTGATCTCTA ATGCTTCTGA TGCCTTGGAC AAGATTCGCT 250 

14201.13 201 250 

260 270 280 290 300 

Hsp 90 251 ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300 

14201 251 300

14201.3 251 ATGANAGCCT GACAGACCCT TCGAAGTNGG TCAGCGGCAA NGAGCTGAAA 300 

14201.5 251 ATGAGAGCCT GACAGACCCT TCGAAGTTGG ACAGTGGTAA AGAGCTGAAA 300 

14201.13 251 300 
310 320 330 340 350

Hsp 90 301 ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC 350 

14201 301 350

14201.3 301 ATTGACATCA TCCCCAACCC TCAGGAACGT NCCCTGACTT TGGTAGACAC 350 

14201.5 301 ATTGACATCA TCCCCAACCC TCAGGAACGT ACCCTGACTT TGGTAGACAC 350 

14201.13 301 350 

360 370 380 390 400 

Hsp 90 351 AGGCATTGGC ATGACCAAAG CTGATCTCAT AAaTAATTtG GGAACCATTG 400 

14201 351 400 

14201.3 351 AGGCATTGGC ATGAaacAAG CTGAcCTCAT NAnTTATTcG GGgAaCcaTt 400 

14201.5 351 AGGCATcGGC ATGACCAAAG CTGATCTCAT AAnTAATTnG GGAACCATTG 400 

14201.13 351 400 

410 420 430 440 450 

Hsp 90 401 CCAAGTCTGG TACTAAAGCA TTCATGGAGG CTCTTCAGGC TGGTGCAGAC 450 

14201 401 450

14201.3 401 CCAAGTCTTG TNCTAAAGCA TTCATGGAGG CTCTNCAGGN TGGcGCAGAC 450 

14201.5 401 NCAAGTCTGG TACTAAAGCA TTCATGGAGG CTCTTCAGGC TGGTGCAGAC 450 

14201.13 401 450 

460 470 480 490 500 

Hsp 90 451 ATCTCCATGA TTGGGCAGTT tGGTGTTGGC TttTATTCTG CCTACTTGGT 500 

14201 451 500

14201.3 451 ATCTCCANGA TTNGGCAGNT GGGTGTTGGC TTnTATTCTG CCcACTTGGT 500 

14201.5 451 ATCTCCATGA TTGGGCAGTT GGGTGTTGNC TTnTATTCTG CCTcCTTGGT 500 

14201.13 451 500 

510 520 530 540 550 

Hsp 90 501 GGCAGAGAAA GTGGTTGTGA TCAGAAAGCA CAACGATGAT GAacAGTATG 550 

14201 501 550 

14201.3 501 GGCAGAGAAA NNT 550 

14201.5 501 GGCAGAGAAA GTNGTTGTGA TCA 550 

14201.13 501 TT GAgnAGTATG 550 

560 570 580 590 600 

Hsp 90 551 cTtgGgAGTc TtCTGcTGGA GGTTCCTTCA CTgtGCGTGC TGACcATGGT 600 

14201 551 600

14201.3 551 600 

14201.5 551 600 

14201.13 551 -TcnGnAGT- TaCTGnTGGA GGTTCCTTCA CTnnGCGTGC TGAC-ATGGT 600 

610 620 630 640 650 

Hsp 90 601 GAGCCCATtG GcAtgGGTAC CAaAGTGATC CTCCATCTtA AAGAAGATCA 650

14201 601 650 

14201.3 601 650 

14201.5 601 650 

14201.13 601 GAGCCCATnG GgAggGGTAC CAnAGTGATC CTCCATCTcA AAGAAGATCA 650 
660 670 680 690 700

Hsp 90 651 GACAGAGTAC CTAGAaGAGA GGCGGgTCAA AGaAGTAGTG AaGAaGCATT 700 

14201 651 700 

14201.3 651 700 

14201.5 651 700 

14201.13 651 GACAGAGTAC CTAGAnGAGA GGCGGaTCAA AGnAGTAGTG AtGAnGCATC 700 

710 720 730 740 750 

Hsp 90 701 CTCAGtTCAT AGGCTATCCC ATCACCCTTT aTTTGGAGAA GGaACGAGAG 750 

14201.3 701 750 

14201.5 701 750

14201.13 701 CTCAGaTCAT AGGCTATCCC ATCACCCTTT nTTTGGAGAA GGnACGAGAG 750 

760 770 780 790 800 

Hsp 90 751 AAGGAaATTA GtGATGATGA GGCAGAGGAA GAGAAaGGTG AGAAaGAAGA 800 

14201 751 800 

14201.3 751 800 

14201.5 751 800 

14201.13 751 AAGGAnATTA GnGATGATGA GGCAGAGGAA GAGAAtGGTG AGAAtGAAGA 800 

810 820 830 840 850 

Hsp 90 801 GGAaGaTAAa GATGATGAAG AAAagCCCAA GATCGAaGAT GTGGgTTCAG 850 

14201 801 850 

14201.3 801 850 

14201.5 801 850 

14201.13 801 GGAnGnTAAc GATGATGAAG AAAncCCCAA GATCGAtGAT GTGGnTTCAG 850 

860 870 880 890 900 

Hsp 90 851 ATGAGGaGGA TGACAGCGGT aAgGATAAGA AGAAGAAaAC TAaGAagATC 900 

14201 851 900

14201.3 851 900 

14201.5 851 900 

14201.13 851 ATGAGGnGGA TGACAGCGGT nAnGATAAGA AGAAGAAnAC TAnGAnnATC 900 

910 920 930 940 950 

Hsp 90 901 AAAGAGAAAT ACATTGATCA GGAAGAACTA AACAAGACCA AGCCTATTTG 950 

14201.3 901 950 

14201.5 901 950 

14201.13 901 950 

960 970 980 990 1000 

Hsp 90 951 GACCAGAAAC CCTGATGACA TCACCCAAGA GGAGTATGGA GAATTCTACA 1000 

14201.3 951 1000 

14201.5 951 1000

14201.13 951 1000 
1010 1020 1030 1040 1050

Hsp 90 1001 AGAGCCTCAC TAATGACTGG GAAGACCACT TGGCAGTCAA GCACTTTTCT 1050

14201 1001 1050

14201.3 1001 1050

14201.5 1001 1050

1060 1070 1080 1090 1100

Hsp 90 1051 GTAGAAGGTC AGTTGGAATT CAGGGCATTG CTATTTATTC CTCGTCGGGC 1100

14201 1051 1100

14201.3 1051 1100

14201.5 1051 1100

14201.13 1051 1100

1110 1120 1130 1140 1150

Hsp 90 1101 TCCCTTTGAC CTTTTTGAGA ACAAGAAGAA AAAGAACAAC ATCAAACTCT 1150

14201 1101 AAGAA AAAGAACAAC ATCAAACTCT 1150

14201.3 1101 1150

14201.5 1101 1150

14201.13 1101 1150



1160 1170 1180 1190 1200

Hsp 90 1151 ATGTCCGCCG TGTGTTCATC ATGGaCAGCT GTGATGAGTT GATACCAGAG 1200

14201 1151 ATGTCCGCCG TGTGTTCATC ATGGnCAGCT GTGATGAGTT GATACCAGAG 1200

14201.3 1151 1200

14201.5 1151 1200

14201.13 1151 1200

1210 1220 1230 1240 1250

Hsp 90 1201 TATCTCAATT TTATCCGTGG TGTGGTTGAC TcTGAGGaTC TGCCCCTGAA 1250

14201 1201 TATCTCAATT TTATCCGTGG TGTGGTTGAC TnTGAGGnTC TGCCCCTGAA 1250

14201.3 1201 1250

14201.5 1201 1250

14201.13 1201 1250

1260 1270 1280 1290 1300 

Hsp 90 1251 CATCTCCCGa GAAATGCTCC AGCAGAGCAA AATCTTGAAA GtCATTCGCA 1300 

14201 1251 CATCTCCCGn GAAATGCTCC AGCAGAGCAA AATCTTGAAA GgCATTCGCA 1300 

14201.3 1251 1300 

14201.5 1251 1300 

14201.13 1251 1300 

1310 1320 1330 1340 1350 

Hsp 90 1301 AAAACATTGT TAAGaAGTGC CTTgAGCTCT TCTCTgAGCT GGCAGAAGaC 1350

14201 1301 AAAACATTGT TAAGnAGTGC CTTnAGCTCT TCTCTnAGCT GGCAGAAGnC 1350

14201.3 1301 1350

14201.5 1301 1350 

14201.13 1301 1350 
1360 1370 1380 1390 1400

Hsp 90 1351 AAGGAGAATT ACAAGAAATT CTATGAGGCA TTCTCTAAAA ATCTCAAGCT 1400

14201 1351 AAGG-GGATT TCAAGAAATT CTTTGGGG 1400

14201.3 1351 1400

14201.5 1351 1400

14201.13 1351 1400

1410 1420 1430 1440 1450

Hsp 90 1401 TGGAATCCAC GAAGACTCCA CTAACCGCCG CCGCCTGTCT GAGCTGCTGC 1450

14201 1401 1450

14201.3 1401 1450

14201.5 1401 1450

14201.13 1401 1450

1460 1470 1480 1490 1500

Hsp 90 1451 GCTATCATAC CTCCCAGTCT GGAGATGAGA TGACATCTCT GTCAGAGTAT 1500

14201 1451 1500

14201.3 1451 1500

14201.5 1451 1500

14201.13 1451 1500

1510 1520 1530 1540 1550

Hsp 90 1501 GTTTCTCGCA TGAAGGAGAC ACAGAAGTCC ATCTATTACA TCACTGGTGA 1550

14201 1501 1550

14201.3 1501 1550

14201.5 1501 1550

14201.13 1501 1550

1560 1570 1580 1590 1600

Hsp 90 1551 GAGCAAAGAG CAGGTGGCCA ACTCAGCTTT TGTGGAGCGA GTGCGGAAAC 1600

14201 1551 1600

14201.3 1551 1600

14201.5 1551 1600

14201.13 1551 1600

1610 1620 1630 1640 1650

Hsp 90 1601 GGGGCTTCGA GGTGGTATAT ATGACCGAGC CCATTGACGA GTACTGTGTG 1650

14201 1601 1650

14201.3 1601 1650

14201.5 1601 1650

14201.13 1601 1650
1660 1670 1680 1690 1700

Hsp 90 1651 CAGCAGCTCA AGGAATTTGA TGGGAAGAGC CTGGTCTCAG TTACCAAGGA 1700

14201 1651 1700

14201.3 1651 1700

14201.5 1651 1700

14201.13 1651 1700

1710 1720 1730 1740 1750 

Hsp 90 1701 GGGTCTGGAG CTGCCTGAGG ATGAGGAGGA GAAGAAGAAG ATGGAAGAGA 1750 

14201 1701 1750 

14201.3 1701 1750 

14201.5 1701 1750 

14201.13 1701 1750 

1760 1770 1780 1790 1800 

Hsp 90 1751 GCAAGGCAAA GTTTGAGAAC CTCTGCAAGC TCATGAAAGA AATCTTAGAT 1800 

14201 1751 1800 

14201.3 1751 1800 

14201.5 1751 1800 

14201.13 1751 1800 

1810 1820 1830 1840 1850 

Hsp 90 1801 AAGAAGGTTG AGAAGGTGAC AATCTCCAAT AGACTTGTGT CTTCACCTTG 1850 

14201 1801 1850

14201.3 1801 1850 

14201.5 1801 1850 

14201.13 1801 1850 

1860 1870 1880 1890 1900 

Hsp 90 1851 CTGCATTGTG ACCAGCACCT ACGGCTGGAC AGCCAATATG GAGCGGATCA 1900 

14201 1851 1900 

14201.3 1851 1900

14201.5 1851 1900 

14201.13 1851 1900 

1910 1920 1930 1940 1950 

Hsp 90 1901 TGAAAGCCCA GGCACTTCGG GACAACTCCA CCATGGGCTA TATGATGGCC 1950 

14201 1901 1950 

14201.3 1901 1950 

14201.5 1901 1950 

14201.13 1901 1950 

1960 1970 1980 1990 2000 

Hsp 90 1951 AAAAAGCACC TGGAGATCAA CCCTGACCAC CCCATTGTGG AGACGCTGCG 2000 

14201 1951 2000

14201.3 1951 2000

14201.5 1951 2000

14201.13 1951 2000 
2010 2020 2030 2040 2050

Hsp 90 2001 GCAGAAGGCT GAGGCCGACA AGAATGATAA GGCAGTTAAG GACCTGGTGG 2050

14201 2001 2050

14201.3 2001 2050

14201.5 2001 2050

14201.13 2001 2050

2060 2070 2080 2090 2100

Hsp 90 2051 TGCTGCTGTT TGAAACCGCC CTGCTATCTT CTGGCTTTTC CCTTGAGGAT 2100

14201 2051 2100

14201.3 2051 2100

14201.5 2051 2100

14201.13 2051 2100

2110 2120 2130 2140 2150

Hsp 90 2101 CCCCAGACCC ACTCCAACCG CATCTATCGC ATGATCAAGC TAGGTCTAGG 2150

14201 2101 2150

14201.3 2101 2150

14201.5 2101 2150

14201.13 2101 2150

2160 2170 2180 2190 2200

Hsp 90 2151 TATTGATGAA GATGAAGTGG CAGCAGAGGA ACCCAATGCT GCAGTTCCTG 2200

14201 2151 2200

14201.3 2151 2200

14201.5 2151 2200

14201.13 2151 2200

2210 2220 2230 2240 2250

Hsp 90 2201 ATGAGATCCC CCCTCTCGAG GGCGATGAGG ATGCGTCTCG CATGGAAGAA 2250

14201 2201 2250

14201.3 2201 2250

14201.5 2201 2250

14201.13 2201 2250

2260 2270 2280 2290 2300

Hsp 90 2251 GTCGATTAGG TTAGGAGTTC ATAGTTGGAA AACTTGTGCC CTTGTATAGT 2300

14201 2251 2300

14201.3 2251 2300

14201.5 2251 2300

14201.13 2251 2300

2310 2320 2330 2340 2350

Hsp 90 2301 GTCCCCATGG GCTCCCACTG CAGCCTCGAG TGCCCCTGTC CCACCTGGCT 2350

14201 2301 2350

14201.3 2301 2350

14201.5 2301 2350

14201.13 2301 2350
2360 2370 2380 2390 2400

Hsp 90 2351 CCCCCTGCTG GTGTCTAGTG TTTTTTTCCC TCTCCTGTCC TTGTGTTGAA 2400

14201 2351 2400

14201.3 2351 2400

14201.5 2351 2400 

14201.13 2351 2400 

2410 2420 2430 2440 2450 

Hsp 90 2401 GGCAGTAAAC TAAGGGTGTC AAGCCCCATT CCCTCTCTAC TCTTGACAGC 2450

14201 2401 2450 

14201.3 2401 2450 

14201.5 2401 2450 

14201.13 2401 2450 

2460 2470 2480 2490 2500 

Hsp 90 2451 AGGATTGGAT GTTGTGTATT GTGGTTTATT TTATTTTCTT CATTTTGTTC 2500

14201 2451 2500 

14201.3 2451 2500 

14201.5 2451 2500 

14201.13 2451 2500 

2510 2520 2530 2540 2550 

Hsp 90 2501 TGAAATTAAA GTATGCAAAA TAAAGAATAT GCCGTTTTTA TAC 2550 

14201 2501 2550 

14201.3 2501 2550 

14201.5 2501 2550 

14201.13 2501 2550 
10 20 30 40 50

capthepsin 1 TCCGGCAACG CCAACCGCTC CGCTGCGCGC AGGCTGGGCT GCAGGCTCTC 50

87058 1 50

87058.6 1 50

87058.8 1 50

87058.16 1 50

60 70 80 90 100

capthepsin 51 GGCTGCAGCG CTGGGCTGGT GTGCAGTGGT GCGACCACGG CTCACGGCAG 100

87058 51 100

87058.6 51 100

87058.8 51 100

87058.16 51 NCN GGTTGAGNAT TCGGACNAGT CCGAAAACGT CCGGCAAGTC 100

110 120 130 140 150

capthepsin 101 CCTCAGCCAC CCAGATGTAA GCGATCTGGT TCCCACCTCA GCCTCCCGAG 150

87058 101 150

87058.6 101 150

87058.8 101 150

87058.16 101 ACCCGCTCCG CTGNGCGCAG GCTGGGNTGC AGGCTCTCGG NTGCAGNGCT 150

160 170 180 190 200

capthepsin 151 TAGTGGATCT AGGATCCGGC TTCCAACATG TGGCAGcTCT GGGCCTCCCT 200

87058 151 200

87058.6 151 200

87058.8 151 200

87058.16 151 GGGTGGATCT AGGATCCGGC TTCCAACATG TGGCAGtTCT GGGCCTCCCT 200

210 220 230 240 250

capthepsin 201 CTGcTGCCTG CTGGTGTTGG cCAATGCCCG GAGcAGGcCC TCTTTCCATC 250

87058 201 250

87058.6 201 250

87058.8 201 250

87058.16 201 CTGnTGCCTG CTGGTGTTGG aCAATGCCCG GAGgAGGnCC TCTTTCCATC 250

260 270 280 290 300

capthepsin 251 CCCTGTCGGA TGAGCTGGTC AaCTATGTCA ACAAACGGAA TACCACGTGG 300

87058 251 300

87058.6 251 300

87058.8 251 300

87058.16 251 CCCTGTCGGA TGAGCTGGTC AnCTATGTCA ACAAACGGAA TACCACGTGG 300
310 320 330 340 350

capthepsin 301 cAGGCCGGaA ACAACTTCTA CAACGTGGAC ATGAGCTACT TGAaGAGGcT 350

87058 301 350

87058.6 301 350

87058.8 301 350

87058.16 301 nAGGCCGGgA ACAACTTCTA CAACGTGGAC ATGAGCTACT TGAnGAGGnT 350

360 370 380 390 400

capthepsin 351 ATGTGGTACC TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA 400

87058 351 400

87058.6 351 400

87058.8 351 GaGGTACC TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTATGTTTA 400

87058.16 351 ATGTGGTACC TTCCTGGGTG GGCCCAAGCC ACCCCAGAGA GTTNTGTTTA 400

410 420 430 440 450

capthepsin 401 CCGAGGACCT GAAGCTGCCT GCAAGCTTCG ATGCACGGGA ACAATGGCCA 450

87058 401 450

87058.6 401 450

87058.8 401 CCGAGGACCT GAAGCTGCCT GCAAGCTTCG ATGCACGGGA ACAATGGCCA 450

87058.16 401 CCGAGGACCT GANGCTGCCT GCAAGCTTCG AaGgACGGGA ACAATGGCCA 450

460 470 480 490 500

capthepsin 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAC CAGGGCTCCT GTGGCTCCTG 500

87058 451 500

87058.6 451 500

87058.8 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAC CAGGGNTCCT GTGGCTCCTG 500

87058.16 451 CAGTGTCCCA CCATCAAAGA GATCAGAGAN CAGGGCTCCT GTGGNTCCTG 500

510 520 530 540 550

capthepsin 501 CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC TGCATCCACA 550

87058 501 550

87058.6 501 550

87058.8 501 CTGGGCCTTC GGGGCTGTGG AAGCCATCTC TGACCGGATC TGNATCCACA 550

87058.16 501 CTGGGCCTCC GGGGCTGTGG AAGNCATCTC TGACCGGATC TGCATCCACA 550

560 570 580 590 600

capthepsin 551 CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CGGAGGACCT GCTCACATGC 600

87058 551 600

87058.6 551 600

87058.8 551 CCAATGCGCA CGTCAGCGTG GAGGTGTCGG CGGAGGAC-T GCTCACATGC 600

87058.16 551 CCAATGNGCA CGTCAGCGTG GtGGTGTCGG NGGAGGACCT GaTCACCTNt 600



610 620 630 640 650 

capthepsin 601 TGTGGCAGCA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC 650 

87058 601 650

87058.6 601 gTGAAGC 650 

87058.8 601 TGTGGCAGNA TGTGTGGGGA CGGCTGTAAT GGTGGCTATC CTGCTGAAGC 650 

87058.16 601 TGTGGtAGCA TGTGTGGGGA CGGCTGTAAT GGTGGtTATC CTGNTGAAGC 650 
capthepsin

87058 

87058.6 

87058.8 

87058.16 



660 670 680 690 700 

651 TTGGAACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT 

651 TTGGAACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGAAT 
651 TTGGNACTTC TGGACAAGAA AAGGCCTGGT TTCTGGTGGC CTCTATGANT 
651 TNGGgNCTTC TNagaAAGAA AAGGCtNGtT TT GGTGGC CT-TATGAcT 

710 720 730 740 750 

701 CCCATGTAGG GTGCAGACCG TACTCCATCC CTCCCTGTGA GCACCACGTC 

701 

701 CCCATGTAGG GTGCAGACCG TACTCCATCC CTCCCTGTGA GCACCACGTC 
701 CCCATGTAGG GTGTAGACCG TACTCCATCC CTCCCTGTGA GCACCACGTC 
701 CCCATGT 

760 770 780 790 800

751 AACGGCTCCC GGCCCCCATG CACGGGGGAG GGAGATACCC CCAAGTCTAG 

751 AACGGCTCCC GGCCCCCATG CACGGGGGAG GGAGATACCC CCAAGTGTAG 